NUCLEIC ACID SEQUENCES TO PROTEINS INVOLVED IN TOCOPHEROL SYNTHESIS

INTRODUCTION 

This application claims the benefit of the filing date of U.S. Application Serial Number
09/549,848, filed April 14, 2000.

TECHNICAL FIELD 
The present invention is directed to nucleic acid and amino acid sequences and 
constructs, and methods related thereto. 



BACKGROUND

Isoprenoids are ubiquitous compounds found in all living organisms. Plants synthesize a
diverse array of greater than 22,000 isoprenoids (Connolly and Hill (1992) Dictionary of
Terpenoids, Chapman and Hall, New York, NY). In plants, isoprenoids play essential roles in
particular cell functions such as production of sterols, contributing to eukaryotic membrane
architecture, acyclic polyprenoids found in the side chain of ubiquinone and plastoquinone,
growth regulators like abscisic acid, gibberellins, brassinosteroids or the photosynthetic pigments
chlorophylls and carotenoids. Although the physiological role of other plant isoprenoids is less
evident, like that of the vast array of secondary metabolites, some are known to play key roles
mediating the adaptive responses to different environmental challenges. In spite of the
remarkable diversity of structure and function, all isoprenoids originate from a single metabolic
precursor, isopentenyl diphosphate (IPP) (Wright, (1961) Annu. Rev. Biochem. 20:525-546; and
Spurgeon and Porter, (1981) in Biosynthesis of Isoprenoid Compounds, Porter and Spurgeon eds
(John Wiley, New York) Vol. 1, pp 1-46).

A number of unique and interconnected biochemical pathways derived from the
isoprenoid pathway leading to secondary metabolites, including tocopherols, exist in chloroplasts
of higher plants. Tocopherols not only perform vital functions in plants, but are also important
from mammalian nutritional perspectives. In plastids, tocopherols account for up to 40% of the
total quinone pool.

Tocopherols and tocotrienols (unsaturated tocopherol derivatives) are well known
antioxidants, and play an important role in protecting cells from free radical damage, and in the
prevention of many diseases, including cardiac disease, cancer, cataracts, retinopathy,
Alzheimer's disease, and neurodegeneration, and have been shown to have beneficial effects on
symptoms of arthritis, and in anti-aging. Vitamin E is used in chicken feed for improving the
shelf life, appearance, flavor, and oxidative stability of meat, and to transfer tocols from feed to
eggs. Vitamin E has been shown to be essential for normal reproduction, improves overall
performance, and enhances immunocompetence in livestock animals. Vitamin E supplement in
animal feed also imparts oxidative stability to milk products.

The demand for natural tocopherols as supplements has been steadily growing at a rate of
10-20% for the past three years. At present, the demand exceeds the supply for natural
tocopherols, which are known to be more biopotent than racemic mixtures of synthetically
produced tocopherols. Naturally occurring tocopherols are all d-stereomers, whereas synthetic
α-tocopherol is a mixture of eight d,l-α-tocopherol isomers, only one of which (12.5%) is identical
to the natural d-α-tocopherol. Natural d-α-tocopherol has the highest vitamin E activity (1.49
IU/mg) when compared to other natural tocopherols or tocotrienols. The synthetic α-tocopherol
has a vitamin E activity of 1.1 IU/mg. In 1995, the worldwide market for raw refined
tocopherols was $1020 million; synthetic materials comprised 85-88% of the market, the
remaining 12-15% being natural materials. The best sources of natural tocopherols and
tocotrienols are vegetable oils and grain products. Currently, most of the natural Vitamin E is
produced from γ-tocopherol derived from soy oil processing, which is subsequently converted to
α-tocopherol by chemical modification (α-tocopherol exhibits the greatest biological activity).
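
The 12.5% figure follows from simple counting. The following is a minimal sketch, assuming background chemistry that is not stated in this text: that α-tocopherol has three chiral centres (positions 2, 4' and 8'), that the natural d-form is the all-R (RRR) configuration, and that the synthetic mixture contains the eight isomers in equal amounts. The names `CHIRAL_CENTERS` and `stereoisomers` are illustrative only.

```python
from itertools import product

# Assumption (background chemistry, not taken from this text): alpha-tocopherol
# has three chiral centres, at ring position 2 and side-chain positions 4' and 8'.
CHIRAL_CENTERS = ("2", "4'", "8'")

# Vitamin E activities quoted in the text, in IU/mg.
NATURAL_ACTIVITY_IU_PER_MG = 1.49
SYNTHETIC_ACTIVITY_IU_PER_MG = 1.1


def stereoisomers(centers=CHIRAL_CENTERS):
    """Return every R/S configuration, one letter per chiral centre."""
    return ["".join(cfg) for cfg in product("RS", repeat=len(centers))]


isomers = stereoisomers()
# Fraction of an equimolar synthetic mixture that is the natural RRR form.
natural_fraction = isomers.count("RRR") / len(isomers)
# Relative potency of the natural form versus the synthetic mixture.
potency_ratio = NATURAL_ACTIVITY_IU_PER_MG / SYNTHETIC_ACTIVITY_IU_PER_MG

print(len(isomers), f"{natural_fraction:.1%}", round(potency_ratio, 2))
```

Under these assumptions one isomer in eight matches the natural form, which is the 12.5% quoted above, and the quoted activities put the natural form at roughly 1.35 times the potency of the synthetic mixture.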
Methods of enhancing the levels of tocopherols and tocotrienols in plants, especially levels
of the more desirable compounds that can be used directly, without chemical modification, would be
useful to the art as such molecules exhibit better functionality and bioavailability.

In addition, methods for the increased production of other isoprenoid derived compounds
in a host plant cell are desirable. Furthermore, methods for the production of particular isoprenoid
compounds in a host plant cell are also needed.

SUMMARY OF THE INVENTION 

The present invention is directed to sequences to proteins involved in tocopherol
synthesis. The polynucleotides and polypeptides of the present invention include those derived
from prokaryotic and eukaryotic sources. 

Thus, one aspect of the present invention relates to prenyltransferase, and in particular to 
isolated polynucleotide sequences encoding prenyltransferase proteins and polypeptides related 
thereto. In particular, isolated nucleic acid sequences encoding prenyltransferase proteins from 
bacterial and plant sources are provided. 

In another aspect, the present invention provides isolated polynucleotide sequences 
encoding tocopherol cyclase, and polypeptides related thereto. In particular, isolated nucleic acid 
sequences encoding tocopherol cyclase proteins from bacterial and plant sources are provided. 

Another aspect of the present invention relates to oligonucleotides which include partial 
or complete prenyltransferase or tocopherol cyclase encoding sequences. 

It is also an aspect of the present invention to provide recombinant DNA constructs which
can be used for transcription or transcription and translation (expression) of prenyltransferase or 
tocopherol cyclase. In particular, constructs are provided which are capable of transcription or 
transcription and translation in host cells. 

In another aspect of the present invention, methods are provided for production of 
prenyltransferase or tocopherol cyclase in a host cell or progeny thereof. In particular, host cells 
are transformed or transfected with a DNA construct which can be used for transcription or 
transcription and translation of prenyltransferase or tocopherol cyclase. The recombinant cells 
which contain prenyltransferase or tocopherol cyclase are also part of the present invention. 

In a further aspect, the present invention relates to methods of using polynucleotide and
polypeptide sequences to modify the tocopherol content of host cells, particularly in host plant
cells. Plant cells having such a modified tocopherol content are also contemplated herein.
Methods and cells in which both prenyltransferase and tocopherol cyclase are expressed in a host
cell are also part of the present invention.

The modified plants, seeds and oils obtained by the expression of the prenyltransferase or
tocopherol cyclase are also considered part of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides an amino acid sequence alignment between ATPT2, ATPT3, ATPT4,
ATPT8, and ATPT12, performed using ClustalW.

Figure 2 provides a schematic picture of the expression construct pCGN10800.

Figure 3 provides a schematic picture of the expression construct pCGN10801.

Figure 4 provides a schematic picture of the expression construct pCGN10803.

Figure 5 provides a schematic picture of the construct pCGN10806.

Figure 6 provides a schematic picture of the construct pCGN10807.

Figure 7 provides a schematic picture of the construct pCGN10808.

Figure 8 provides a schematic picture of the expression construct pCGN10809.

Figure 9 provides a schematic picture of the expression construct pCGN10810.

Figure 10 provides a schematic picture of the expression construct pCGN10811.

Figure 11 provides a schematic picture of the expression construct pCGN10812.

Figure 12 provides a schematic picture of the expression construct pCGN10813.

Figure 13 provides a schematic picture of the expression construct pCGN10814.

Figure 14 provides a schematic picture of the expression construct pCGN10815.

Figure 15 provides a schematic picture of the expression construct pCGN10816.

Figure 16 provides a schematic picture of the expression construct pCGN10817.

Figure 17 provides a schematic picture of the expression construct pCGN10819.

Figure 18 provides a schematic picture of the expression construct pCGN10824.

Figure 19 provides a schematic picture of the expression construct pCGN10825.

Figure 20 provides a schematic picture of the expression construct pCGN10826.

Figure 21 provides an amino acid sequence alignment using ClustalW between the
Synechocystis prenyltransferase sequences.

Figure 22 provides an amino acid sequence of the ATPT2, ATPT3, ATPT4, ATPT8, and
ATPT12 protein sequences from Arabidopsis and the slr1736, slr0926, sll1899, slr0056, and the
slr1518 amino acid sequences from Synechocystis.

Figure 23 provides the results of the enzymatic assay from preparations of wild type
Synechocystis strain 6803, and Synechocystis slr1736 knockout.

Figure 24 provides bar graphs of HPLC data obtained from seed extracts of transgenic
Arabidopsis containing pCGN10822, which provides for the expression of the ATPT2 sequence,
in the sense orientation, from the napin promoter. Provided are graphs for alpha, gamma, and
delta tocopherols, as well as total tocopherol for 22 transformed lines, as well as a
nontransformed (wildtype) control.

Figure 25 provides a bar graph of HPLC analysis of seed extracts from Arabidopsis plants
transformed with pCGN10803 (35S-ATPT2, in the antisense orientation), pCGN10822 (line
1625, napin ATPT2 in the sense orientation), pCGN10809 (line 1627, 35S-ATPT3 in the sense
orientation), a nontransformed (wt) control, and an empty vector transformed control.

Figure 26 shows total tocopherol levels measured in Arabidopsis seed of line.

Figure 27 shows total tocopherol levels measured in T# Arabidopsis seed of line.

Figure 28 shows total tocopherol levels measured in developing canola seed of line
10822-1.

Figure 29 shows results of phytyl prenyltransferase activity assay using Synechocystis
wild type and slr1737 knockout mutant membrane preparations.

Figure 30 is the chromatograph from an HPLC analysis of Synechocystis extracts.

Figure 31 is a sequence alignment of the Arabidopsis homologue with the sequence of the
public database.

Figure 32 shows the results of hydropathic analysis of slr1737.

Figure 33 shows the results of hydropathic analysis of the Arabidopsis homologue of
slr1737.

Figure 34 shows the catalytic mechanism of various cyclase enzymes.

Figure 35 is a sequence alignment of slr1737, slr1737 Arabidopsis homologue and the
Arabidopsis chalcone isomerase.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides, inter alia, compositions and methods for altering (for 
example, increasing and decreasing) the tocopherol levels and/or modulating their ratios in host 
cells. In particular, the present invention provides polynucleotides, polypeptides, and methods of 
use thereof for the modulation of tocopherol content in host plant cells. 

The biosynthesis of α-tocopherol in higher plants involves condensation of homogentisic
acid and phytylpyrophosphate to form 2-methyl-6 phytylbenzoquinol that can, by cyclization and
subsequent methylations (Fiedler et al., 1982, Planta, 155: 511-515, Soll et al., 1980, Arch.
Biochem. Biophys. 204: 544-550, Marshall et al., 1985 Phytochem., 24: 1705-1711, all of which
are herein incorporated by reference in their entirety), form various tocopherols.

The Arabidopsis pds2 mutant identified and characterized by Norris et al. (1995), is
deficient in tocopherol and plastoquinone-9 accumulation. Further genetic and biochemical
analysis suggested that the protein encoded by PDS2 may be responsible for the prenylation of
homogentisic acid. The PDS2 locus identified by Norris et al. (1995) has been hypothesized to
possibly encode the tocopherol phytyl-prenyltransferase, as the pds2 mutant fails to accumulate
tocopherols.

Norris et al. (1995) determined that in Arabidopsis pds2 lies at the top of chromosome 3,
approximately 7 centimorgans above long hypocotyl2, based on the genetic map. ATPT2 is
located on chromosome 2 between 36 and 41 centimorgans, lying on BAC F19F24, indicating
that ATPT2 does not correspond to PDS2. Thus, it is an aspect of the present invention to
provide novel polynucleotides and polypeptides involved in the prenylation of homogentisic acid.
This reaction may be a rate limiting step in tocopherol biosynthesis, and this gene has yet to be
isolated.

U.S. Patent No. 5,432,069 describes the partial purification and characterization of
tocopherol cyclase from Chlorella protothecoides, Dunaliella salina and wheat. The cyclase is
described as being glycine rich, water soluble and with a predicted MW of 48-50 kDa. However,
only limited peptide fragment sequences were available.

In one aspect, the present invention provides polynucleotide and polypeptide sequences
involved in the prenylation of straight chain and aromatic compounds. Straight chain
prenyltransferases, as used herein, comprise sequences which encode proteins involved in the
prenylation of straight chain compounds, including, but not limited to, geranyl geranyl
pyrophosphate and farnesyl pyrophosphate. Aromatic prenyltransferases, as used herein,
comprise sequences which encode proteins involved in the prenylation of aromatic compounds,
including, but not limited to, menaquinone, ubiquinone, chlorophyll, and homogentisic acid. The
prenyltransferase of the present invention preferably prenylates homogentisic acid.

In another aspect, the invention provides polynucleotide and polypeptide sequences to
tocopherol cyclization enzymes. The 2,3-dimethyl-5-phytylplastoquinol cyclase (tocopherol
cyclase) is responsible for the cyclization of 2,3-dimethyl-5-phytylplastoquinol to tocopherol.

Isolated Polynucleotides, Proteins, and Polypeptides

A first aspect of the present invention relates to isolated prenyltransferase
polynucleotides. Another aspect of the present invention relates to isolated tocopherol cyclase
polynucleotides. The polynucleotide sequences of the present invention include isolated
polynucleotides that encode the polypeptides of the invention having a deduced amino acid
sequence selected from the group of sequences set forth in the Sequence Listing and to other
polynucleotide sequences closely related to such sequences and variants thereof.

The invention provides a polynucleotide sequence identical over its entire length to each
coding sequence as set forth in the Sequence Listing. The invention also provides the coding
sequence for the mature polypeptide or a fragment thereof, as well as the coding sequence for the
mature polypeptide or a fragment thereof in a reading frame with other coding sequences, such as
those encoding a leader or secretory sequence, a pre-, pro-, or prepro- protein sequence. The
polynucleotide can also include non-coding sequences, including for example, but not limited to,
non-coding 5' and 3' sequences, such as the transcribed, untranslated sequences, termination
signals, ribosome binding sites, sequences that stabilize mRNA, introns, polyadenylation signals,



WO 02/33060

PCT/US01/42673



and additional coding sequence that encodes additional amino acids. For example, a marker
sequence can be included to facilitate the purification of the fused polypeptide. Polynucleotides
of the present invention also include polynucleotides comprising a structural gene and the
naturally associated sequences that control gene expression.

The invention also includes polynucleotides of the formula:

X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3 are
any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and 1000 and
R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence selected from
the group set forth in the Sequence Listing and preferably those of SEQ ID NOs: 1, 3, 5, 7, 8, 10,
11, 13-16, 18, 23, 29, 36, and 38. In the formula, R2 is oriented so that its 5' end residue is at the
left, bound to R1, and its 3' end residue is at the right, bound to R3. Any stretch of nucleic acid
residues denoted by either R group, where R is greater than 1, may be either a heteropolymer or a
homopolymer, preferably a heteropolymer.
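
The polynucleotide formula can be read as a core sequence flanked on each side by a run of residues. The sketch below is one illustrative reading of it in code, not a definition taken from the claims. The function names, the limit constant, and the short placeholder sequences are hypothetical, and none of the sequences comes from the Sequence Listing.

```python
MAX_FLANK = 3000  # upper bound on n given in the text


def build_polynucleotide(r1, r2, r3):
    """Join the 5' flank R1, the core R2 and the 3' flank R3, written 5' to 3'."""
    r1, r2, r3 = (s.upper() for s in (r1, r2, r3))
    for name, seq in (("R1", r1), ("R2", r2), ("R3", r3)):
        if not seq or set(seq) - set("ACGT"):
            raise ValueError(f"{name} must be a non-empty DNA string")
    for name, flank in (("R1", r1), ("R3", r3)):
        if len(flank) > MAX_FLANK:
            raise ValueError(f"{name} exceeds {MAX_FLANK} residues")
    return r1 + r2 + r3


def is_heteropolymer(stretch):
    """True when a stretch contains more than one kind of residue."""
    return len(set(stretch.upper())) > 1


# Placeholder sequences for illustration only.
five_prime = "GGATCC"
core = "ATGGCTTAA"
three_prime = "AAAAAA"

full = build_polynucleotide(five_prime, core, three_prime)
print(full, is_heteropolymer(five_prime), is_heteropolymer(three_prime))
```

Here the 5' flank is a heteropolymer and the 3' flank is a homopolymer, the two cases the text distinguishes.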

The invention also relates to variants of the polynucleotides described herein that encode
for variants of the polypeptides of the invention. Variants that are fragments of the
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein 5
to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the invention are
substituted, added or deleted, in any combination. Particularly preferred are substitutions,
additions, and deletions that are silent such that they do not alter the properties or activities of the
polynucleotide or polypeptide.

Further preferred embodiments of the invention are polynucleotides that are at least 50%, 60%,
or 70% identical over their entire length to a polynucleotide encoding a polypeptide of the
invention, and polynucleotides that are complementary to such polynucleotides. More preferable are
polynucleotides that comprise a region that is at least 80% identical over its entire length to a
polynucleotide encoding a polypeptide of the invention and polynucleotides that are
complementary thereto. In this regard, polynucleotides at least 90% identical over their entire
length are particularly preferred, and those at least 95% identical are especially preferred. Further,
those with at least 97% identity are highly preferred and those with at least 98% and 99% identity
are particularly highly preferred, with those at least 99% being the most highly preferred.
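
The identity thresholds in this paragraph form a simple ladder. The sketch below is a minimal illustration that assumes the two sequences have already been aligned to equal length and that identity is counted as matching columns over the full alignment length. The text does not fix that convention, and the names `percent_identity`, `TIERS` and `preference_tier` are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Thresholds and labels as worded in the paragraph, highest first.
TIERS = [
    (99, "most highly preferred"),
    (98, "particularly highly preferred"),
    (97, "highly preferred"),
    (95, "especially preferred"),
    (90, "particularly preferred"),
    (80, "more preferable"),
    (50, "further preferred"),
]


def preference_tier(identity):
    """Return the label of the highest threshold met, or None below 50%."""
    for threshold, label in TIERS:
        if identity >= threshold:
            return label
    return None


# Placeholder sequences for illustration only: 9 of 10 positions agree.
identity = percent_identity("ATGGCTAGCT", "ATGGCTAGCA")
print(identity, preference_tier(identity))
```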

Preferred embodiments are polynucleotides that encode polypeptides that retain
substantially the same biological function or activity as the mature polypeptides encoded by the
polynucleotides set forth in the Sequence Listing.

The invention further relates to polynucleotides that hybridize to the above-described
sequences. In particular, the invention relates to polynucleotides that hybridize under stringent
conditions to the above-described polynucleotides. As used herein, the terms "stringent
conditions" and "stringent hybridization conditions" mean that hybridization will generally occur
if there is at least 95% and preferably at least 97% identity between the sequences. An example
of stringent hybridization conditions is overnight incubation at 42°C in a solution comprising
50% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate
(pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 micrograms/milliliter denatured,
sheared salmon sperm DNA, followed by washing the hybridization support in 0.1x SSC at
approximately 65°C. Other hybridization and wash conditions are well known and are
exemplified in Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold
Spring Harbor, NY (1989), particularly Chapter 11.
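
The salt concentrations in the hybridization and wash buffers can be worked out by scaling. The sketch below assumes that the parenthetical "150 mM NaCl, 15 mM trisodium citrate" gives the standard 1x SSC composition, so that an Nx buffer contains N times those amounts. This reading is an assumption about the text, and the names `SSC_1X_MM` and `ssc_concentrations` are illustrative.

```python
# Assumption: the values in parentheses in the text are the 1x SSC composition (mM).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(strength):
    """Return component concentrations (mM) for an SSC buffer of the given strength."""
    if strength <= 0:
        raise ValueError("buffer strength must be positive")
    return {name: conc * strength for name, conc in SSC_1X_MM.items()}


hybridization = ssc_concentrations(5)    # 5x SSC used for the overnight incubation
wash = ssc_concentrations(0.1)           # 0.1x SSC used for the final wash

for label, buffer in (("5x", hybridization), ("0.1x", wash)):
    parts = ", ".join(f"{name} {conc:.1f} mM" for name, conc in buffer.items())
    print(f"{label} SSC: {parts}")
```

On this reading the wash buffer carries one fiftieth of the salt of the hybridization buffer, which together with the higher wash temperature is what makes the wash the more stringent step.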

The invention also provides a polynucleotide consisting essentially of a polynucleotide
sequence obtainable by screening an appropriate library containing the complete gene for a
polynucleotide sequence set forth in the Sequence Listing under stringent hybridization conditions
with a probe having the sequence of said polynucleotide sequence or a fragment thereof, and
isolating said polynucleotide sequence. Fragments useful for obtaining such a polynucleotide
include, for example, probes and primers as described herein.

As discussed herein regarding polynucleotide assays of the invention, for example,
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least 15
bases. Preferably such probes will have at least 30 bases and can have at least 50 bases.
Particularly preferred probes will have between 30 bases and 50 bases, inclusive.
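
The probe length guidance above can be expressed as a small classifier. This is an illustrative sketch only. The function name and the label strings are hypothetical, and the bins simply restate the thresholds given in the paragraph (15, 30 and 50 bases).

```python
def classify_probe(probe):
    """Bin a hybridization probe by length using the thresholds in the text."""
    length = len(probe)
    if length < 15:
        return "below general minimum"
    if length < 30:
        return "meets general minimum"
    if length <= 50:
        return "particularly preferred"  # between 30 and 50 bases, inclusive
    return "preferred"                   # at least 30 bases, but longer than 50


# Placeholder probes of different lengths, for illustration only.
for n in (12, 20, 30, 50, 80):
    print(n, classify_probe("A" * n))
```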

The coding region of each gene that comprises or is comprised by a polynucleotide
sequence set forth in the Sequence Listing may be isolated by screening using a DNA sequence
provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled
oligonucleotide having a sequence complementary to that of a gene of the invention is then used
to screen a library of cDNA, genomic DNA or mRNA to identify members of the library which
hybridize to the probe. For example, synthetic oligonucleotides are prepared which correspond
to the prenyltransferase or tocopherol cyclase EST sequences. The oligonucleotides are used as
primers in polymerase chain reaction (PCR) techniques to obtain 5' and 3' terminal sequence of
prenyltransferase or tocopherol cyclase genes. Alternatively, where oligonucleotides of low
degeneracy can be prepared from particular prenyltransferase or tocopherol cyclase peptides,
such probes may be used directly to screen gene libraries for prenyltransferase or tocopherol
cyclase gene sequences. In particular, screening of cDNA libraries in phage vectors is useful in
such methods due to lower levels of background hybridization.

Typically, a prenyltransferase or tocopherol cyclase sequence obtainable from the use of
nucleic acid probes will show 60-70% sequence identity between the target prenyltransferase or
tocopherol cyclase sequence and the encoding sequence used as a probe. However, lengthy
sequences with as little as 50-60% sequence identity may also be obtained. The nucleic acid
probes may be a lengthy fragment of the nucleic acid sequence, or may also be a shorter,
oligonucleotide probe. When longer nucleic acid fragments are employed as probes (greater than
about 100 bp), one may screen at lower stringencies in order to obtain sequences from the target
sample which have 20-50% deviation (i.e., 50-80% sequence homology) from the sequences
used as probe. Oligonucleotide probes can be considerably shorter than the entire nucleic acid
sequence encoding a prenyltransferase or tocopherol cyclase enzyme, but should be at least
about 10, preferably at least about 15, and more preferably at least about 20 nucleotides. A
higher degree of sequence identity is desired when shorter regions are used as opposed to longer
regions. It may thus be desirable to identify regions of highly conserved amino acid sequence to
design oligonucleotide probes for detecting and recovering other related prenyltransferase or
tocopherol cyclase genes. Shorter probes are often particularly useful for polymerase chain
reactions (PCR), especially when highly conserved sequences can be identified. (See, Gould, et
al., PNAS USA (1989) 86:1934-1938.)

Another aspect of the present invention relates to prenyltransferase or tocopherol cyclase
polypeptides. Such polypeptides include isolated polypeptides set forth in the Sequence Listing,
as well as polypeptides and fragments thereof, particularly those polypeptides which exhibit
prenyltransferase or tocopherol cyclase activity and also those polypeptides which have at least
50%, 60% or 70% identity, preferably at least 80% identity, more preferably at least 90%
identity, and most preferably at least 95% identity to a polypeptide sequence selected from the
group of sequences set forth in the Sequence Listing, and also include portions of such
polypeptides, wherein such portion of the polypeptide preferably includes at least 30 amino acids
and more preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the
sequences. In the art, "identity" also means the degree of sequence relatedness between
polypeptide or polynucleotide sequences, as determined by the match between strings of such
sequences. "Identity" can be readily calculated by known methods including, but not limited to,
those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press,
New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M.
and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in Molecular
Biology, von Heijne, G., Academic Press (1987); Sequence Analysis Primer, Gribskov, M. and
Devereux, J., eds., Stockton Press, New York (1991); and Carillo, H., and Lipman, D., SIAM J
Applied Math, 48:1073 (1988). Methods to determine identity are designed to give the largest
match between the sequences tested. Moreover, methods to determine identity are codified in
publicly available programs. Computer programs which can be used to determine identity
between two sequences include, but are not limited to, GCG (Devereux, J., et al., Nucleic Acids
Research 12(1):387 (1984)); and a suite of five BLAST programs, three designed for nucleotide
sequence queries (BLASTN, BLASTX, and TBLASTX) and two designed for protein sequence
queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren,
et al., Genome Analysis, 1: 543-559 (1997)). The BLAST X program is publicly available from
NCBI and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD
20894; Altschul, S., et al., J. Mol. Biol. 215:403-410 (1990)). The well known Smith Waterman
algorithm can also be used to determine identity.

Parameters for polypeptide sequence comparison typically include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)

Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA
89:10915-10919 (1992)

Gap Penalty: 12

Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters along with
no penalty for end gap are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)

Comparison matrix: matches = +10; mismatches = 0

Gap Penalty: 50

Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the
default parameters for nucleic acid comparisons.
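
To show how the polynucleotide parameters combine into a score, the sketch below scores an alignment that has already been written out with "-" marking gap positions. It does not search for the best alignment as the Needleman and Wunsch algorithm does, and it is not the "gap" program itself. It assumes that a gap of length k costs the gap penalty once plus k times the gap length penalty, and it charges end gaps like internal gaps. Both conventions are assumptions, and the function name is hypothetical.

```python
def score_alignment(seq_a, seq_b, match=10, mismatch=0, gap_open=50, gap_extend=3):
    """Score a pre-aligned pair of sequences under the nucleic acid parameters."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    prev_gap = None  # which sequence carried a gap in the previous column
    for x, y in zip(seq_a, seq_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if x == "-" or y == "-":
            current_gap = "a" if x == "-" else "b"
            if current_gap != prev_gap:
                score -= gap_open   # a new gap starts in this column
            score -= gap_extend     # every gap position pays the length penalty
            prev_gap = current_gap
        else:
            score += match if x == y else mismatch
            prev_gap = None
    return score


# Placeholder alignment: 7 matches, 1 mismatch, one gap of length 2.
example_score = score_alignment("ATGGCTAGCT", "ATGG--AGCA")
print(example_score)
```

In the example the seven matches contribute 70, the mismatch contributes 0, and the single two-position gap costs 50 + 2 x 3 = 56, which shows how heavily these defaults penalize opening a gap relative to a mismatch.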

The invention also includes polypeptides of the formula:

X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or a
metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is an
amino acid sequence of the invention, particularly an amino acid sequence selected from the
group set forth in the Sequence Listing and preferably those encoded by the sequences provided
in SEQ ID NOs: 2, 4, 6, 9, 12, 17, 19-22, 24-28, 30, 32-35, 37, and 39. In the formula, R2 is
oriented so that its amino terminal residue is at the left, bound to R1, and its carboxy terminal
residue is at the right, bound to R3. Any stretch of amino acid residues denoted by either R
group, where R is greater than 1, may be either a heteropolymer or a homopolymer, preferably a
heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by a
polynucleotide comprising a sequence selected from the group of a sequence contained in the
Sequence Listing set forth herein.

The polypeptides of the present invention can be mature protein or can be part of a fusion
protein.

Fragments and variants of the polypeptides are also considered to be a part of the
invention. A fragment is a variant polypeptide which has an amino acid sequence that is entirely
the same as part but not all of the amino acid sequence of the previously described polypeptides.
The fragments can be "free-standing" or comprised within a larger polypeptide of which the
fragment forms a part or a region, most preferably as a single continuous region. Preferred
fragments are biologically active fragments which are those fragments that mediate activities of
the polypeptides of the invention, including those with similar activity or improved activity or
with a decreased activity. Also included are those fragments that are antigenic or immunogenic in
an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set
forth in the Sequence Listing by conservative amino acid substitutions, substitution of a residue
by another with like characteristics. In general, such substitutions are among Ala, Val, Leu and
Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between Lys and Arg; or
between Phe and Tyr. Particularly preferred are variants in which 5 to 10; 1 to 5; 1 to 3 or one
amino acid(s) are substituted, deleted, or added, in any combination.
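
The conservative substitution groups listed above lend themselves to a lookup. The sketch below is a minimal illustration using three-letter residue codes. The group memberships are taken from the paragraph, while the function names and the example peptides are hypothetical and do not come from the Sequence Listing.

```python
# Groups of residues with like characteristics, as listed in the text.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},
    {"Ser", "Thr"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Lys", "Arg"},
    {"Phe", "Tyr"},
]


def is_conservative(original, replacement):
    """True when two different residues fall in the same group."""
    if original == replacement:
        return False
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


def classify_changes(reference, variant):
    """Count (conservative, other) substitutions between equal-length peptides."""
    if len(reference) != len(variant):
        raise ValueError("peptides must have the same number of residues")
    conservative = other = 0
    for ref, var in zip(reference, variant):
        if ref == var:
            continue
        if is_conservative(ref, var):
            conservative += 1
        else:
            other += 1
    return conservative, other


# Placeholder peptides for illustration only.
reference = ["Met", "Ala", "Ser", "Lys", "Phe"]
variant = ["Met", "Val", "Thr", "Lys", "Trp"]
print(classify_changes(reference, variant))
```

In this example Ala to Val and Ser to Thr count as conservative, while Phe to Trp does not, because Trp is not in any of the listed groups.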

Variants that are fragments of the polypeptides of the invention can be used to produce 
the corresponding full length polypeptide by peptide synthesis. Therefore, these variants can be 
used as intermediates for producing the full-length polypeptides of the invention. 

The polynucleotides and polypeptides of the invention can be used, for example, in the
transformation of host cells, such as plant host cells, as further discussed herein.

The invention also provides polynucleotides that encode a polypeptide that is a mature
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the mature
polypeptide (for example, when the mature form of the protein has more than one polypeptide
chain). Such sequences can, for example, play a role in the processing of a protein from a
precursor to a mature form, allow protein transport, shorten or lengthen protein half-life, or
facilitate manipulation of the protein in assays or production. It is contemplated that cellular
enzymes can be used to remove any additional amino acids from the mature protein.

A precursor protein, having the mature form of the polypeptide fused to one or more
prosequences, may be an inactive form of the polypeptide. The inactive precursors generally are
activated when the prosequences are removed. Some or all of the prosequences may be removed
prior to activation. Such precursor proteins are generally called proproteins.

Plant Constructs and Methods of Use

Of particular interest is the use of the nucleotide sequences in recombinant DNA
constructs to direct the transcription or transcription and translation (expression) of the
prenyltransferase or tocopherol cyclase sequences of the present invention in a host plant cell.
The expression constructs generally comprise a promoter functional in a host plant cell operably
linked to a nucleic acid sequence encoding a prenyltransferase or tocopherol cyclase of the
present invention and a transcriptional termination region functional in a host plant cell.

A first nucleic acid sequence is "operably linked" or "operably associated" with a second 
nucleic acid sequence when the sequences are so arranged that the first nucleic acid sequence 
affects the function of the second nucleic acid sequence. Preferably, the two sequences are part
of a single contiguous nucleic acid molecule and more preferably are adjacent. For example, a
promoter is operably linked to a gene if the promoter regulates or mediates transcription of the 
gene in a cell. 

Those skilled in the art will recognize that there are a number of promoters which are
functional in plant cells, and have been described in the literature. Chloroplast and plastid 
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid 
operable promoters are also envisioned. 

One set of plant functional promoters are constitutive promoters such as the CaMV35S or
FMV35S promoters that yield high levels of expression in most plant organs. Enhanced or
duplicated versions of the CaMV35S and FMV35S promoters are useful in the practice of this
invention (Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent Number 5,378,619). In
addition, it may also be preferred to bring about expression of the prenyltransferase or tocopherol
cyclase gene in specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and
the promoter chosen should have the desired tissue and developmental specificity.

Of particular interest is the expression of the nucleic acid sequences of the present
invention from transcription initiation regions which are preferentially expressed in a plant seed
tissue. Examples of such seed preferential transcription initiation sequences include those
sequences derived from sequences encoding plant storage protein genes or from genes involved
in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5' regulatory
regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)), phaseolin, zein,
soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit of β-conglycinin
(soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and oleosin.

It may be advantageous to direct the localization of proteins conferring prenyltransferase
or tocopherol cyclase to a particular subcellular compartment, for example, to the mitochondrion,
endoplasmic reticulum, vacuoles, chloroplast or other plastidic compartment. For example,
where the genes of interest of the present invention will be targeted to plastids, such as
chloroplasts, for expression, the constructs will also employ the use of sequences to direct the
gene to the plastid. Such sequences are referred to herein as chloroplast transit peptides (CTP) or
plastid transit peptides (PTP). In this manner, where the gene of interest is not directly inserted
into the plastid, the expression construct will additionally contain a gene encoding a transit
peptide to direct the gene of interest to the plastid. The chloroplast transit peptides may be
derived from the gene of interest, or may be derived from a heterologous sequence having a CTP.
Such transit peptides are known in the art. See, for example, Von Heijne et al. (1991) Plant Mol.
Biol. Rep. 9:104-126; Clark et al. (1989) J. Biol. Chem. 264:17544-17550; della-Cioppa et al.
(1987) Plant Physiol. 84:965-968; Romer et al. (1993) Biochem. Biophys. Res. Commun.
196:1414-1421; and Shah et al. (1986) Science 233:478-481.

Depending upon the intended use, the constructs may contain the nucleic acid sequence
which encodes the entire prenyltransferase or tocopherol cyclase protein, or a portion thereof.
For example, where antisense inhibition of a given prenyltransferase or tocopherol cyclase
protein is desired, the entire prenyltransferase or tocopherol cyclase sequence is not required.
Furthermore, where prenyltransferase or tocopherol cyclase sequences used in constructs are
intended for use as probes, it may be advantageous to prepare constructs containing only a
particular portion of a prenyltransferase or tocopherol cyclase encoding sequence, for example a
sequence which is discovered to encode a highly conserved prenyltransferase or tocopherol
cyclase region.

The skilled artisan will recognize that there are various methods for the inhibition of
expression of endogenous sequences in a host cell. Such methods include, but are not limited to,
antisense suppression (Smith, et al. (1988) Nature 334:724-726), co-suppression (Napoli, et al.
(1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and combinations of
sense and antisense (Waterhouse, et al. (1998) Proc. Natl. Acad. Sci. USA 95:13959-13964).
Methods for the suppression of endogenous sequences in a host cell typically employ the
transcription or transcription and translation of at least a portion of the sequence to be
suppressed. Such sequences may be homologous to coding as well as non-coding regions of the
endogenous sequence.

Regulatory transcript termination regions may be provided in plant expression constructs
of this invention as well. Transcript termination regions may be provided by the DNA sequence
encoding the prenyltransferase or tocopherol cyclase or a convenient transcription termination
region derived from a different gene source, for example, the transcript termination region which
is naturally associated with the transcript initiation region. The skilled artisan will recognize that
any convenient transcript termination region which is capable of terminating transcription in a
plant cell may be employed in the constructs of the present invention.

Alternatively, constructs may be prepared to direct the expression of the prenyltransferase
or tocopherol cyclase sequences directly from the host plant cell plastid. Such constructs and
methods are known in the art and are generally described, for example, in Svab, et al. (1990)
Proc. Natl. Acad. Sci. USA 87:8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci.
USA 90:913-917 and in U.S. Patent Number 5,693,507.

The prenyltransferase or tocopherol cyclase constructs of the present invention can be
used in transformation methods with additional constructs providing for the expression of other
nucleic acid sequences encoding proteins involved in the production of tocopherols, or
tocopherol precursors such as homogentisic acid and/or phytylpyrophosphate. Nucleic acid
sequences encoding proteins involved in the production of homogentisic acid are known in the
art, and include, but are not limited to, 4-hydroxyphenylpyruvate dioxygenase (HPPD, EC
1.13.11.27) described, for example, by Garcia, et al. ((1999) Plant Physiol. 119(4):1507-1516),
mono or bifunctional tyrA (described, for example, by Xia, et al. (1992) J. Gen. Microbiol.
138:1309-1316, and Hudson, et al. (1984) J. Mol. Biol. 180:1023-1051), Oxygenase,
4-hydroxyphenylpyruvate di- (9CI), 4-Hydroxyphenylpyruvate dioxygenase;
p-Hydroxyphenylpyruvate dioxygenase; p-Hydroxyphenylpyruvate hydroxylase;
p-Hydroxyphenylpyruvate oxidase; p-Hydroxyphenylpyruvic acid hydroxylase;
p-Hydroxyphenylpyruvic hydroxylase; p-Hydroxyphenylpyruvic oxidase, 4-hydroxyphenylacetate,
NAD(P)H:oxygen oxidoreductase (1-hydroxylating); 4-hydroxyphenylacetate 1-monooxygenase,
and the like. In addition, constructs for the expression of nucleic acid sequences encoding
proteins involved in the production of phytylpyrophosphate can also be employed with the
prenyltransferase or tocopherol cyclase constructs of the present invention. Nucleic acid
sequences encoding proteins involved in the production of phytylpyrophosphate are known in the
art, and include, but are not limited to, geranylgeranylpyrophosphate synthase (GGPPS),
geranylgeranylpyrophosphate reductase (GGH), 1-deoxyxylulose-5-phosphate synthase,
1-deoxy-D-xylulose-5-phosphate reductoisomerase, 4-diphosphocytidyl-2-C-methylerythritol
synthase, and isopentenyl pyrophosphate isomerase.

The prenyltransferase or tocopherol cyclase sequences of the present invention find use in
the preparation of transformation constructs having a second expression cassette for the
expression of additional sequences involved in tocopherol biosynthesis. Additional tocopherol
biosynthesis sequences of interest in the present invention include, but are not limited to,
gamma-tocopherol methyltransferase (Shintani, et al. (1998) Science 282(5396):2098-2100),
tocopherol cyclase, and tocopherol methyltransferase.

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs 
containing the expression constructs have been introduced is considered transformed, transfected,
or transgenic. A transgenic or transformed cell or plant also includes progeny of the cell or plant
and progeny produced from a breeding program employing such a transgenic plant as a parent in 
a cross and exhibiting an altered phenotype resulting from the presence of a prenyltransferase or 
tocopherol cyclase nucleic acid sequence. 

Plant expression or transcription constructs having a prenyltransferase or tocopherol
cyclase as the DNA sequence of interest for increased or decreased expression thereof may be
employed with a wide variety of plant life, particularly, plant life involved in the production of
vegetable oils for edible and industrial uses. Particularly preferred plants for use in the methods
of the present invention include, but are not limited to: Acacia, alfalfa, aneth, apple, apricot,
artichoke, arugula, asparagus, avocado, banana, barley, beans, beet, blackberry, blueberry,
broccoli, brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, celery,
cherry, chicory, cilantro, citrus, clementines, coffee, corn, cotton, cucumber, Douglas fir,
eggplant, endive, escarole, eucalyptus, fennel, figs, garlic, gourd, grape, grapefruit, honey dew,
jicama, kiwifruit, lettuce, leeks, lemon, lime, Loblolly pine, mango, melon, mushroom, nectarine,
nut, oat, oil palm, oil seed rape, okra, onion, orange, an ornamental plant, papaya, parsley, pea,
peach, peanut, pear, pepper, persimmon, pine, pineapple, plantain, plum, pomegranate, poplar,
potato, pumpkin, quince, radiata pine, radicchio, radish, raspberry, rice, rye, sorghum, Southern
pine, soybean, spinach, squash, strawberry, sugarbeet, sugarcane, sunflower, sweet potato,
sweetgum, tangerine, tea, tobacco, tomato, triticale, turf, turnip, a vine, watermelon, wheat, yams,
and zucchini.

Most especially preferred are temperate oilseed crops. Temperate oilseed crops of
interest include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties),
sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Depending on
the method for introducing the recombinant constructs into the host cell, other DNA sequences
may be required. Importantly, this invention is applicable to dicotyledonous and monocotyledonous
species alike and will be readily applicable to new and/or improved transformation and
regulation techniques.

Of particular interest is the use of prenyltransferase or tocopherol cyclase constructs in
plants to produce plants or plant parts, including, but not limited to, leaves, stems, roots,
reproductive tissues, and seed, with a modified content of tocopherols in plant parts having
transformed plant cells.

For immunological screening, antibodies to the protein can be prepared by injecting
rabbits or mice with the purified protein or portion thereof, such methods of preparing antibodies
being well known to those in the art. Either monoclonal or polyclonal antibodies can be
produced, although typically polyclonal antibodies are more useful for gene isolation. Western
analysis may be conducted to determine that a related protein is present in a crude extract of the
desired plant species, as determined by cross-reaction with the antibodies to the encoded
proteins. When cross-reactivity is observed, genes encoding the related proteins are isolated by
screening expression libraries representing the desired plant species. Expression libraries can be
constructed in a variety of commercially available vectors, including lambda gt11, as described in
Sambrook, et al. (Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring
Harbor Laboratory, Cold Spring Harbor, New York).

To confirm the activity and specificity of the proteins encoded by the identified nucleic
acid sequences as prenyltransferase or tocopherol cyclase enzymes, in vitro assays are performed
in insect cell cultures using baculovirus expression systems. Such baculovirus expression
systems are known in the art and are described by Lee, et al., U.S. Patent Number 5,348,886, the
entirety of which is herein incorporated by reference.

In addition, other expression constructs may be prepared to assay for protein activity
utilizing different expression systems. Such expression constructs are transformed into yeast or
prokaryotic host and assayed for prenyltransferase or tocopherol cyclase activity. Such
expression systems are known in the art and are readily available through commercial sources.

In addition to the sequences described in the present invention, DNA coding sequences
useful in the present invention can be derived from algae, fungi, bacteria, mammalian sources,
plants, etc. Homology searches in existing databases using signature sequences corresponding
to conserved nucleotide and amino acid sequences of prenyltransferase or tocopherol cyclase can
be employed to isolate equivalent, related genes from other sources such as plants and
microorganisms. Searches in EST databases can also be employed. Furthermore, the use of
DNA sequences encoding enzymes functionally and enzymatically equivalent to those disclosed
herein, wherein such DNA sequences are degenerate equivalents of the nucleic acid sequences
disclosed herein in accordance with the degeneracy of the genetic code, is also encompassed by
the present invention. Demonstration of the functionality of coding sequences identified by any
of these methods can be carried out by complementation of mutants of appropriate organisms,
such as Synechocystis, Shewanella, yeast, Pseudomonas, Rhodobacteria, etc., that lack specific
biochemical reactions, or that have been mutated. The sequences of the DNA coding regions
can be optimized by gene resynthesis, based on codon usage, for maximum expression in
particular hosts.

For the alteration of tocopherol production in a host cell, a second expression construct
can be used in accordance with the present invention. For example, the prenyltransferase or
tocopherol cyclase expression construct can be introduced into a host cell in conjunction with a
second expression construct having a nucleotide sequence for a protein involved in tocopherol
biosynthesis.
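The notion of degenerate equivalents discussed above, that is, distinct DNA sequences that encode the same polypeptide because of the degeneracy of the genetic code, can be illustrated with a minimal sketch. It assumes the standard genetic code, in-frame input, and uppercase A/C/G/T; the function names are hypothetical and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; trailing partial codons are dropped.

    Codons containing characters other than A/C/G/T are rendered as 'X'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def encode_same_protein(seq_a: str, seq_b: str) -> bool:
    """Return True if two coding sequences translate to the same polypeptide."""
    return translate(seq_a) == translate(seq_b)
```

For instance, ATGGCTAAA and ATGGCCAAG differ at two nucleotide positions but both translate to Met-Ala-Lys, so under these assumptions they are degenerate equivalents of each other.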

The method of transformation in obtaining such transgenic plants is not critical to the
instant invention, and various methods of plant transformation are currently available.
Furthermore, as newer methods become available to transform crops, they may also be directly
applied hereunder. For example, many plant species naturally susceptible to Agrobacterium
infection may be successfully transformed via tripartite or binary vector methods of
Agrobacterium mediated transformation. In many instances, it will be desirable to have the
construct bordered on one or both sides by T-DNA, particularly having the left and right borders,
more particularly the right border. This is particularly useful when the construct uses A.
tumefaciens or A. rhizogenes as a mode for transformation, although the T-DNA borders may
find use with other modes of transformation. In addition, techniques of microinjection, DNA
particle bombardment, and electroporation have been developed which allow for the
transformation of various monocot and dicot plant species.

Normally, included with the DNA construct will be a structural gene having the necessary
regulatory regions for expression in a host and providing for selection of transformant cells. The
gene may provide for resistance to a cytotoxic agent, e.g. antibiotic, heavy metal, toxin, etc.,
complementation providing prototrophy to an auxotrophic host, viral immunity or the like.
Depending upon the number of different host species into which the expression construct or
components thereof are introduced, one or more markers may be employed, where different
conditions for selection are used for the different hosts.

Where Agrobacterium is used for plant cell transformation, a vector may be used which
may be introduced into the Agrobacterium host for homologous recombination with T-DNA or
the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid containing the
T-DNA for recombination may be armed (capable of causing gall formation) or disarmed
(incapable of causing gall formation), the latter being permissible, so long as the vir genes are
present in the transformed Agrobacterium host. The armed plasmid can give a mixture of normal
plant cells and gall.

In some instances where Agrobacterium is used as the vehicle for transforming host plant
cells, the expression or transcription construct bordered by the T-DNA border region(s) will be
inserted into a broad host range vector capable of replication in E. coli and Agrobacterium, there
being broad host range vectors described in the literature. Commonly used is pRK2 or
derivatives thereof. See, for example, Ditta, et al., (Proc. Nat. Acad. Sci., U.S.A. (1980)
77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference. Alternatively,
one may insert the sequences to be expressed in plant cells into a vector containing separate
replication sequences, one of which stabilizes the vector in E. coli, and the other in
Agrobacterium. See, for example, McBride, et al. (Plant Mol. Biol. (1990) 14:269-276), wherein
the pRiHRI (Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374) origin of replication is
utilized and provides for added stability of the plant expression vectors in host Agrobacterium
cells.

Included with the expression construct and the T-DNA will be one or more markers,
which allow for selection of transformed Agrobacterium and transformed plant cells. A
number of markers have been developed for use with plant cells, such as resistance to
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The particular
marker employed is not essential to this invention, one or another marker being preferred
depending on the particular host and the manner of construction.

For transformation of plant cells using Agrobacterium, explants may be combined and
incubated with the transformed Agrobacterium for sufficient time for transformation, the bacteria
killed, and the plant cells cultured in an appropriate selective medium. Once callus forms, shoot
formation can be encouraged by employing the appropriate plant hormones in accordance with
known methods and the shoots transferred to rooting medium for regeneration of plants. The
plants may then be grown to seed and the seed used to establish repetitive generations and for
isolation of vegetable oils.

There are several possible ways to obtain the plant cells of this invention which contain
multiple expression constructs. Any means for producing a plant comprising a construct having
a DNA sequence encoding the expression construct of the present invention, and at least one
other construct having another DNA sequence encoding an enzyme, is encompassed by the
present invention. For example, the expression construct can be used to transform a plant at the
same time as the second construct either by inclusion of both expression constructs in a single
transformation vector or by using separate vectors, each of which expresses desired genes. The
second construct can be introduced into a plant which has already been transformed with the
prenyltransferase or tocopherol cyclase expression construct, or alternatively, transformed plants,
one expressing the prenyltransferase or tocopherol cyclase construct and one expressing the
second construct, can be crossed to bring the constructs together in the same plant.

Transgenic plants of the present invention may be produced from tissue culture, and
subsequent generations grown from seed. Alternatively, transgenic plants may be grown using
apomixis. Apomixis is a genetically controlled method of reproduction in plants where the
embryo is formed without union of an egg and a sperm. There are three basic types of apomictic
reproduction: 1) apospory where the embryo develops from a chromosomally unreduced egg in
an embryo sac derived from the nucellus, 2) diplospory where the embryo develops from an
unreduced egg in an embryo sac derived from the megaspore mother cell, and 3) adventitious
embryony where the embryo develops directly from a somatic cell. In most forms of apomixis,
pseudogamy or fertilization of the polar nuclei to produce endosperm is necessary for seed
viability. In apospory, a nurse cultivar can be used as a pollen source for endosperm formation in
seeds. The nurse cultivar does not affect the genetics of the aposporous apomictic cultivar since
the unreduced egg of the cultivar develops parthenogenetically, but makes possible endosperm
production. Apomixis is economically important, especially in transgenic plants, because it
causes any genotype, no matter how heterozygous, to breed true. Thus, with apomictic
reproduction, heterozygous transgenic plants can maintain their genetic fidelity throughout
repeated life cycles. Methods for the production of apomictic plants are known in the art. See,
U.S. Patent No. 5,811,636, which is herein incorporated by reference in its entirety.

The nucleic acid sequences of the present invention can be used in constructs to provide
for the expression of the sequence in a variety of host cells, both prokaryotic and eukaryotic. Host
cells of the present invention preferably include monocotyledonous and dicotyledonous plant
cells.

In general, the skilled artisan is familiar with the standard resource materials which
describe specific conditions and procedures for the construction, manipulation and isolation of
macromolecules (e.g., DNA molecules, plasmids, etc.), generation of recombinant organisms and
the screening and isolating of clones (see, for example, Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Press (1989); Maliga et al., Methods in Plant
Molecular Biology, Cold Spring Harbor Press (1995), the entirety of which is herein incorporated
by reference; Birren et al., Genome Analysis: Analyzing DNA, 1, Cold Spring Harbor, New York,
the entirety of which is herein incorporated by reference).

Methods for the expression of sequences in insect host cells are known in the art.
Baculovirus expression vectors are recombinant insect viruses in which the coding sequence for a
chosen foreign gene has been inserted behind a baculovirus promoter in place of the viral gene,
e.g., polyhedrin (Smith and Summers, U.S. Pat. No. 4,745,051, the entirety of which is
incorporated herein by reference). Baculovirus expression vectors are known in the art, and are
described, for example, in Doerfler, Curr. Top. Microbiol. Immunol. 131:51-68 (1968); Luckow
and Summers, Bio/Technology 6:47-55 (1988a); Miller, Annual Review of Microbiol. 42:177-199
(1988); Summers, Curr. Comm. Molecular Biology, Cold Spring Harbor Press, Cold Spring
Harbor, N.Y. (1988); Summers and Smith, A Manual of Methods for Baculovirus Vectors and
Insect Cell Culture Procedures, Texas Ag. Exper. Station Bulletin No. 1555 (1988), the
entireties of which are herein incorporated by reference.

Methods for the expression of a nucleic acid sequence of interest in a fungal host cell are
known in the art. The fungal host cell may, for example, be a yeast cell or a filamentous fungal
cell. Methods for the expression of DNA sequences of interest in yeast cells are generally
described in "Guide to Yeast Genetics and Molecular Biology", Guthrie and Fink, eds., Methods in
Enzymology, Academic Press, Inc., Vol. 194 (1991) and "Gene Expression Technology", Goeddel
ed., Methods in Enzymology, Academic Press, Inc., Vol. 185 (1991).

Mammalian cell lines available as hosts for expression are known in the art and include
many immortalized cell lines available from the American Type Culture Collection (ATCC,
Manassas, VA), such as HeLa cells, Chinese hamster ovary (CHO) cells, baby hamster kidney
(BHK) cells and a number of other cell lines. Suitable promoters for mammalian cells are also
known in the art and include, but are not limited to, viral promoters such as that from Simian
Virus 40 (SV40) (Fiers et al., Nature 273:113 (1978), the entirety of which is herein incorporated
by reference), Rous sarcoma virus (RSV), adenovirus (ADV) and bovine papilloma virus (BPV).
Mammalian cells may also require terminator sequences and poly-A addition sequences.
Enhancer sequences which increase expression may also be included and sequences which
promote amplification of the gene may also be desirable (for example methotrexate resistance
genes).

Vectors suitable for replication in mammalian cells are well known in the art, and may
include viral replicons, or sequences which ensure integration of the appropriate sequences
encoding epitopes into the host genome. Plasmid vectors that greatly facilitate the construction of
recombinant viruses have been described (see, for example, Mackett et al., J. Virol. 49:857
(1984); Chakrabarti et al., Mol. Cell Biol. 5:3403 (1985); Moss, In: Gene Transfer Vectors For
Mammalian Cells (Miller and Calos, eds., Cold Spring Harbor Laboratory, N.Y., p. 10, (1987));
all of which are herein incorporated by reference in their entirety).

The invention also includes plants and plant parts, such as seed, oil and meal derived 
from seed, and feed and food products processed from plants, which are enriched in tocopherols. 
Of particular interest is seed oil obtained from transgenic plants where the tocopherol level has 
been increased as compared to seed oil of a non-transgenic plant. 

The harvested plant material may be subjected to additional processing to further enrich 
the tocopherol content. The skilled artisan will recognize that there are many such processes or 
methods for refining, bleaching and degumming oil. United States Patent Number 5,932,261, 
issued August 3, 1999, discloses one such process, for the production of a natural carotene rich
refined and deodorised oil by subjecting the oil to a pressure of less than 0.060 mbar and to a
temperature of less than 200°C. Oil distilled by this process has reduced free fatty acids,
yielding a refined, deodorised oil where Vitamin E contained in the feed oil is substantially 
retained in the processed oil. The teachings of this patent are incorporated herein by reference. 

The invention now being generally described, it will be more readily understood by 
reference to the following examples which are included for purposes of illustration only and are 
not intended to limit the present invention.

EXAMPLES

Example 1: Identification of Prenyltransferase or Tocopherol Cyclase Sequences

PSI-BLAST (Altschul, et al. (1997) Nuc Acid Res 25:3389-3402) profiles were generated
for both the straight chain and aromatic classes of prenyltransferases. To generate the straight
chain profile, a prenyltransferase from Porphyra purpurea (GenBank accession 1709766) was
used as a query against the NCBI non-redundant protein database. The E. coli enzyme involved
in the formation of ubiquinone, ubiA (GenBank accession 1790473), was used as a starting
sequence to generate the aromatic prenyltransferase profile. These profiles were used to search
public and proprietary DNA and protein databases. In Arabidopsis six putative
prenyltransferases of the straight-chain class were identified, ATPT1 (SEQ ID NO:9), ATPT7
(SEQ ID NO:10), ATPT8 (SEQ ID NO:11), ATPT9 (SEQ ID NO:13), ATPT10 (SEQ ID
NO:14), and ATPT11 (SEQ ID NO:15), and six were identified of the aromatic class, ATPT2
(SEQ ID NO:1), ATPT3 (SEQ ID NO:3), ATPT4 (SEQ ID NO:5), ATPT5 (SEQ ID NO:7),
ATPT6 (SEQ ID NO:8), and ATPT12 (SEQ ID NO:16). Additional prenyltransferase sequences
from other plants related to the aromatic class of prenyltransferases, such as soy (SEQ ID NOs:
19-23, the deduced amino acid sequence of SEQ ID NO:23 is provided in SEQ ID NO:24) and
maize (SEQ ID NOs:25-29, and 31) are also identified. The deduced amino acid sequence of
ZMPT5 (SEQ ID NO:29) is provided in SEQ ID NO:30.

Searches are performed on a Silicon Graphics Unix computer using additional
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This software and
hardware enables the use of the Smith-Waterman algorithm in searching DNA and protein
databases using profiles as queries. The program used to query protein databases is profilesearch.
This is a search where the query is not a single sequence but a profile based on a multiple
alignment of amino acid or nucleic acid sequences. The profile is used to query a sequence data
set, i.e., a sequence database. The profile contains all the pertinent information for scoring each
position in a sequence, in effect replacing the "scoring matrix" used for the standard query
searches. The program used to query nucleotide databases with a protein profile is tprofilesearch.
Tprofilesearch searches nucleic acid databases using an amino acid profile query. As the search is
running, sequences in the database are translated to amino acid sequences in six reading frames.
The output file for tprofilesearch is identical to the output file for profilesearch except for an
additional column that indicates the frame in which the best alignment occurred.
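The six-frame translation step described above can be sketched as follows. This is only an illustration of the general idea, not the tprofilesearch implementation, whose internals are not described here; it assumes the standard genetic code, and the function names and frame labels (+1 to +3, -1 to -3) are chosen for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def translate_frame(dna: str, offset: int) -> str:
    """Translate dna starting at the given offset; partial codons are dropped."""
    end = len(dna) - (len(dna) - offset) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(offset, end, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Map frame labels (+1..+3 forward, -1..-3 reverse) to translated peptides."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate_frame(dna, offset)
        frames[f"-{offset + 1}"] = translate_frame(rc, offset)
    return frames
```

A search tool of this kind would score the protein profile against each of the six translations and report the frame giving the best alignment, which corresponds to the additional output column mentioned above.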

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used to search
for similarities between one sequence from the query and a group of sequences contained in the
database. E score values as well as other sequence information, such as conserved peptide
sequences, are used to identify related sequences.
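For readers unfamiliar with the Smith-Waterman algorithm, the following is a minimal sketch of its local alignment score for two plain sequences. It is a simplification: the searches described above use a profile as the query and hardware acceleration, whereas this example assumes a single query sequence, a simple match/mismatch score, and a linear gap penalty, all of which are illustrative parameter choices.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score between sequences a and b.

    Uses the classic dynamic-programming recurrence with a floor of zero,
    keeping only two rows of the score matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

With the default scores, a four-residue query found exactly within a longer database sequence scores 8 (four matches at 2 each), while two sequences with no residues in common score 0.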

To obtain the entire coding region corresponding to the Arabidopsis prenyltransferase
sequences, synthetic oligonucleotide primers are designed to amplify the 5' and 3' ends of
partial cDNA clones containing prenyltransferase sequences. Primers are designed according to
the respective Arabidopsis prenyltransferase sequences and used in Rapid Amplification of
cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85:8998-9002)
using the Marathon cDNA amplification kit (Clontech Laboratories Inc, Palo Alto, CA).

Amino acid sequence alignments between ATPT2 (SEQ ID NO:2), ATPT3 (SEQ ID
NO:4), ATPT4 (SEQ ID NO:6), ATPT8 (SEQ ID NO:12), and ATPT12 (SEQ ID NO:17) are
performed using ClustalW (Figure 1), and the percent identity and similarities are provided in
Table 1 below.



Table 1:

                        ATPT2   ATPT3   ATPT4   ATPT8  ATPT12
ATPT2    % Identity                12      13      11      15
         % Similar                 25      25      22      32
         % Gap                     17      20      20       9
ATPT3    % Identity                        12       6      22
         % Similar                         29      16      38
         % Gap                             20      24      14
ATPT4    % Identity                                 9      14
         % Similar                                 18      29
         % Gap                                     26      19
ATPT8    % Identity                                         7
         % Similar                                         19
         % Gap                                             20
ATPT12   % Identity
         % Similar
         % Gap
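
The percent identity and percent gap figures of the kind reported in Table 1 can be derived from a pair of aligned rows as sketched below. This is a hedged illustration only: the denominator (alignment columns in which at least one sequence has a residue) is an assumption, ClustalW-based tools may use a different convention, and percent similarity is omitted because it depends on a substitution grouping that the text does not specify.

```python
def alignment_stats(row_a, row_b, gap_char="-"):
    """Return percent identity and percent gap for two aligned rows.

    Columns where both rows carry a gap are ignored. The choice of
    denominator is an assumption made for this sketch.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    columns = [
        (x, y) for x, y in zip(row_a, row_b)
        if not (x == gap_char and y == gap_char)
    ]
    if not columns:
        return {"identity": 0.0, "gap": 0.0}
    identical = sum(1 for x, y in columns if x == y)
    gapped = sum(1 for x, y in columns if gap_char in (x, y))
    total = len(columns)
    return {"identity": 100 * identical / total, "gap": 100 * gapped / total}


if __name__ == "__main__":
    # Made-up aligned rows, not taken from Figure 1.
    print(alignment_stats("MKTLLVA-GG", "MRTLLVAAG-"))
```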





Example 2: Preparation of Prenyl Transferase Expression Constructs 

A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to allow
the cloning of multiple napin fusion genes into plant binary transformation vectors. An adapter
comprised of the self-annealed oligonucleotide of sequence

CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:40) was ligated into the cloning vector pBC SK+ (Stratagene) after digestion with
the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids pCGN3223 and
pCGN7765 were digested with NotI and ligated together. The resultant vector, pCGN7770,
contains the pCGN7765 backbone with the napin seed-specific expression cassette from
pCGN3223.
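
The statement that the adapter oligonucleotide is self-annealing can be checked computationally. The sketch below, with hypothetical helper names, looks for the longest 3' portion of the oligonucleotide that equals its own reverse complement and reports the unpaired 5' overhang. The recognition sequences listed are the standard ones for these enzymes and are included as an assumption about which sites the adapter was meant to carry.

```python
ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"  # SEQ ID NO:40

# Standard recognition sequences (assumed relevant to the adapter design).
SITES = {
    "SwaI": "ATTTAAAT",
    "AscI": "GGCGCGCC",
    "Sse8387I": "CCTGCAGG",
    "NotI": "GCGGCCGC",
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def self_annealed_overhang(oligo):
    """Return the 5' overhang left when two copies of oligo anneal.

    Scans for the longest suffix that is its own reverse complement and
    returns the bases in front of it, or None if no such suffix exists.
    """
    for start in range(len(oligo)):
        core = oligo[start:]
        if core == reverse_complement(core):
            return oligo[:start]
    return None


def find_sites(oligo, sites=SITES):
    """Map each enzyme name to the first 0-based position of its site, or -1."""
    return {name: oligo.find(seq) for name, seq in sites.items()}


if __name__ == "__main__":
    print("5' overhang:", self_annealed_overhang(ADAPTER))
    print(find_sites(ADAPTER))
```

The 5' overhang found this way is CGCG, which matches the overhang left by BssHII digestion (G^CGCGC) and is consistent with ligation of the adapter into the BssHII-digested vector.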

The cloning cassette pCGN7787 contains essentially the same regulatory elements as
pCGN7770, with the exception that the napin regulatory regions of pCGN7770 have been replaced
with the double CaMV 35S promoter and the tml polyadenylation and transcriptional
termination region.

A binary vector for plant transformation, pCGN5139, was constructed from pCGN1558
(McBride and Summerfelt (1990) Plant Molecular Biology 14:269-276). The polylinker of
pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker containing unique
restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI. The Asp718 and
HindIII restriction endonuclease sites are retained in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA
sequences into binary vectors containing transcriptional initiation regions (promoters) and
transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:41) and 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:42) into SalI/XhoI-
digested pCGN7770. A fragment containing the napin promoter, polylinker and napin 3' region
was excised from pCGN8618 by digestion with Asp718I; the fragment was blunt-ended by filling
in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that had been digested
with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment.
A plasmid containing the insert oriented so that the napin promoter was closest to the blunted
Asp718I site of pCGN5139 and the napin 3' was closest to the blunted HindIII site was subjected
to sequence analysis to confirm both the insert orientation and the integrity of cloning junctions.
The resulting plasmid was designated pCGN8622.

The plasmid pCGN8619 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:43) and 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:44) into SalI/XhoI-
digested pCGN7770. A fragment containing the napin promoter, polylinker and napin 3' region
was removed from pCGN8619 by digestion with Asp718I; the fragment was blunt-ended by
filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that had been
digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow
fragment. A plasmid containing the insert oriented so that the napin promoter was closest to the
blunted Asp718I site of pCGN5139 and the napin 3' was closest to the blunted HindIII site was
subjected to sequence analysis to confirm both the insert orientation and the integrity of cloning
junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:45) and 5'-
CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:46) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region was
removed from pCGN8620 by complete digestion with Asp718I and partial digestion with NotI.
The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment then ligated
into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in
the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the
d35S promoter was closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest
to the blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:47) and 5'-
GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:48) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region was
removed from pCGN8621 by complete digestion with Asp718I and partial digestion with NotI.
The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment then ligated
into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in
the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the
d35S promoter was closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest
to the blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8625.

The plasmid construct pCGN8640 is a modification of pCGN8624 described above. A
938 bp PstI fragment isolated from transposon Tn7 which encodes bacterial spectinomycin and
streptomycin resistance (Fling et al. (1985) Nucleic Acids Research 13(19):7095-7106), a
determinant for E. coli and Agrobacterium selection, was blunt-ended with Pfu polymerase. The
blunt-ended fragment was ligated into pCGN8624 that had been digested with SpeI and blunt-
ended with Pfu polymerase. The region containing the PstI fragment was sequenced to confirm
both the insert orientation and the integrity of cloning junctions.

The spectinomycin resistance marker was introduced into pCGN8622 and pCGN8623 as
follows. A 7.7 kbp AvrII-SnaBI fragment from pCGN8640 was ligated to a 10.9 kbp AvrII-
SnaBI fragment from pCGN8623 or pCGN8622, described above. The resulting plasmids were
pCGN8641 and pCGN8643, respectively.

The plasmid pCGN8644 was constructed by ligating oligonucleotides 5'-
GATCACCTGCAGGAAGCTTGCGGCCGCGGATCCAATGCA-3' (SEQ ID NO:49) and 5'-
TTGGATCCGCGGCCGCAAGCTTCCTGCAGCT-3' (SEQ ID NO:50) into BamHI-PstI
digested pCGN8640.

Synthetic oligonucleotides were designed for use in Polymerase Chain Reactions (PCR) to
amplify the coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 for the preparation
of expression constructs and are provided in Table 2 below.



Table 2:

Name     Restriction Site  Sequence                                            SEQ ID NO:
ATPT2    5' NotI           GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT           51
ATPT2    3' SseI           GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT              52
ATPT3    5' NotI           GGATCCGCGGCCGCACAATGGCGTTTTTTGGGCTCTCCCGTGTTT       53
ATPT3    3' SseI           GGATCCTGCAGGTTATTGAAAACTTCTTCCAAGTACAACT            54
ATPT4    5' NotI           GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT              55
ATPT4    3' SseI           GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT               56
ATPT8    5' NotI           GGATCCGCGGCCGCACAATGGTACTTGCCGAGGTTCCAAAGCTTGCCTCT  57
ATPT8    3' SseI           GGATCCTGCAGGTCACTTGTTTCTGGTGATGACTCTAT              58
ATPT12   5' NotI           GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT              59
ATPT12   3' SseI           GGATCCTGCAGGTCAGTGTTGCGATGCTAATGCCGT                60



The coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 were all amplified
using the respective PCR primers shown in Table 2 above and cloned into the TopoTA vector
(Invitrogen). Constructs containing the respective prenyltransferase sequences were digested with
NotI and Sse8387I and cloned into the turbo binary vectors described above.
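
The layout of the Table 2 primers can be inspected with a short script. The sketch below is illustrative only: it locates the NotI site in a 5' primer and the start codon that follows it, and locates the Sse8387I site in a 3' primer and checks whether the next three bases are the reverse complement of a stop codon. The helper name and return format are assumptions of this sketch.

```python
NOTI = "GCGGCCGC"       # NotI recognition sequence
SSE8387I = "CCTGCAGG"   # Sse8387I recognition sequence
STOP_CODONS_RC = {"TTA", "CTA", "TCA"}  # reverse complements of TAA, TAG, TGA

# Primers copied from Table 2 (ATPT2 and ATPT4 rows).
PRIMERS = {
    ("ATPT2", "5'"): "GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT",
    ("ATPT2", "3'"): "GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT",
    ("ATPT4", "5'"): "GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT",
    ("ATPT4", "3'"): "GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT",
}


def describe_primer(seq, end):
    """Summarize where the cloning site and start/stop signal sit in a primer.

    Positions are 0-based. Returns None if the expected site is absent.
    """
    site = NOTI if end == "5'" else SSE8387I
    pos = seq.find(site)
    if pos < 0:
        return None
    after = seq[pos + len(site):]
    if end == "5'":
        atg = after.find("ATG")
        start = None if atg < 0 else pos + len(site) + atg
        return {"site_at": pos, "start_codon_at": start}
    return {"site_at": pos, "stop_follows_site": after[:3] in STOP_CODONS_RC}


if __name__ == "__main__":
    for (gene, end), seq in PRIMERS.items():
        print(gene, end, describe_primer(seq, end))
```

On these primers the NotI site precedes the ATG and the Sse8387I site is followed by the antisense form of a stop codon, which is consistent with excising the full coding sequence as a NotI/Sse8387I fragment.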

The sequence encoding ATPT2 prenyltransferase was cloned in the sense orientation into
pCGN8640 to produce the plant transformation construct pCGN10800 (Figure 2). The ATPT2
sequence is under control of the 35S promoter.

The ATPT2 sequence was also cloned in the antisense orientation into the construct
pCGN8641 to create pCGN10801 (Figure 3). This construct provides for the antisense expression of
the ATPT2 sequence from the napin promoter.

The ATPT2 coding sequence was also cloned in the sense orientation into the vector
pCGN8643 to create the plant transformation construct pCGN10822.

The ATPT2 coding sequence was also cloned in the antisense orientation into the vector
pCGN8644 to create the plant transformation construct pCGN10803 (Figure 4).

The ATPT4 coding sequence was cloned into the vector pCGN864 to create the plant
transformation construct pCGN10806 (Figure 5). The ATPT2 coding sequence was cloned into the
TopoTA(TM) vector from Invitrogen to create the plant transformation construct
pCGN10807 (Figure 6). The ATPT3 coding sequence was cloned into the TopoTA vector to create
the plant transformation construct pCGN10808 (Figure 7). The ATPT3 coding sequence was cloned
in the sense orientation into the vector pCGN8640 to create the plant transformation construct
pCGN10809 (Figure 8). The ATPT3 coding sequence was cloned in the antisense orientation into the
vector pCGN8641 to create the plant transformation construct pCGN10810 (Figure 9). The ATPT3
coding sequence was cloned into the vector pCGN8643 to create the plant transformation construct
pCGN10811 (Figure 10). The ATPT3 coding sequence was cloned into the vector pCGN8644 to
create the plant transformation construct pCGN10812 (Figure 11). The ATPT4 coding sequence was
cloned into the vector pCGN8640 to create the plant transformation construct pCGN10813 (Figure
12). The ATPT4 coding sequence was cloned into the vector pCGN8641 to create the plant
transformation construct pCGN10814 (Figure 13). The ATPT4 coding sequence was cloned into the
vector pCGN8643 to create the plant transformation construct pCGN10815 (Figure 14). The ATPT4
coding sequence was cloned in the antisense orientation into the vector pCGN8644 to create the
plant transformation construct pCGN10816 (Figure 15). The ATPT8 coding sequence was cloned in
the sense orientation into the vector pCGN8643 to create the plant transformation construct
pCGN10819 (Figure 17). The ATPT12 coding sequence was cloned into the vector pCGN8640 to
create the plant transformation construct pCGN10824 (Figure 18). The ATPT12 coding sequence
was cloned into the vector pCGN8643 to create the plant transformation construct pCGN10825
(Figure 19). The ATPT8 coding sequence was cloned into the vector pCGN8640 to create the plant
transformation construct pCGN10826 (Figure 20).

Example 3: Plant Transformation with Prenyl Transferase Constructs

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation as
described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports (1992)
11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by Agrobacterium-
mediated transformation as described by Valvekens et al. (Proc. Natl. Acad. Sci. (1988)
85:5536-5540), or as described by Bent et al. ((1994) Science 265:1856-1860), or Bechtold et al.
((1993) C.R. Acad. Sci., Life Sciences 316:1194-1199). Other plant species may be similarly
transformed using related techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et al.
(Bio/Technology 10:286-291), may also be used to obtain nuclear transformed plants.

Example 4: Identification of Additional Prenyltransferases 
Additional BLAST searches were performed using the ATPT2 sequence, a sequence in
the class of aromatic prenyltransferases. ESTs, and in some cases full-length coding regions,
were identified in proprietary DNA libraries.

Soy full-length homologs to ATPT2 were identified by a combination of BLAST (using
ATPT2 protein sequence) and 5' RACE. Two homologs resulted (SEQ ID NO:95 and SEQ ID
NO:96). Translated amino acid sequences are provided by SEQ ID NO:97 and SEQ ID NO:98.

A rice EST ATPT2 homolog is shown in SEQ ID NO:99 (obtained from BLAST using the
wheat ATPT2 homolog).

Other homolog sequences were obtained using ATPT2 and PSI-BLAST, including EST
sequences from wheat (SEQ ID NO:100), leek (SEQ ID NOs:101 and 102), canola (SEQ ID
NO:103), corn (SEQ ID NOs:104, 105 and 106), cotton (SEQ ID NO:107) and tomato (SEQ ID
NO:108).

A PSI-BLAST profile generated using the E. coli ubiA (GenBank accession 1790473)
sequence was used to analyze the Synechocystis genome. This analysis identified 5 open reading
frames (ORFs) in the Synechocystis genome that were potentially prenyltransferases: slr0926
(annotated as ubiA (4-hydroxybenzoate-octaprenyltransferase), SEQ ID NO:32), sll1899
(annotated as ctaB (cytochrome c oxidase folding protein), SEQ ID NO:33), slr0056 (annotated as
g4 (chlorophyll synthase 33 kD subunit), SEQ ID NO:34), slr1518 (annotated as menA
(menaquinone biosynthesis protein), SEQ ID NO:35), and slr1736 (annotated as a hypothetical
protein of unknown function, SEQ ID NO:36).

4A. Synechocystis Knockouts

To determine the functionality of these ORFs and their involvement, if any, in the
biosynthesis of tocopherols, knockout constructs were made to disrupt the ORFs identified in
Synechocystis.

Synthetic oligos were designed to amplify regions from the 5' (5'-
TAATGTGTACATTGTCGGCCTC (17365') (SEQ ID NO:61) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCCACAATTCCCCGCACCGTC
(1736kanpr1)) (SEQ ID NO:62) and 3' (5'-AGGCTAATAAGCACAAATGGGA (17363')
(SEQ ID NO:63) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGAATTGGTTTAGGTTATCCC
(1736kanpr2)) (SEQ ID NO:64) ends of the slr1736 ORF.
The 1736kanpr1 and 1736kanpr2 oligos contained 20 bp of homology to the slr1736 ORF with
an additional 40 bp of sequence homology to the ends of the kanamycin resistance cassette.
Separate PCR steps were completed with these oligos and the products were gel purified and
combined with the kanamycin resistance gene from puc4K (Pharmacia) that had been digested
with HincII and gel purified away from the vector backbone. The combined fragments were
allowed to assemble without oligos under the following conditions: 94°C for 1 min, 55°C for 1
min, 72°C for 1 min plus 5 seconds per cycle for 40 cycles using Pfu polymerase in 100 ul
reaction volume (Zhao, H. and Arnold (1997) Nucleic Acids Res. 25(6):1307-1308). One
microliter or five microliters of this assembly reaction was then amplified using 5' and 3' oligos
nested within the ends of the ORF fragment, so that the resulting product contained 100-200 bp
of the 5' end of the Synechocystis gene to be knocked out, the kanamycin resistance cassette, and
100-200 bp of the 3' end of the gene to be knocked out. This PCR product was then cloned into
the vector pGemT easy (Promega) to create the construct pMON21681 and used for
Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The ubiA 5' sequence was amplified using the primers 5'-
GGATCCATGGTTGCCCAAACCCCATC (SEQ ID NO:65) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGGGTAAGCAACAATGACCGGC
(SEQ ID NO:66). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTCAAAGCCAGCCCAGTAAC (SEQ ID NO:67) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGTGCGAAAAGGGTTTTCCC
(SEQ ID NO:68). The amplification products were combined with the kanamycin resistance gene
from puc4K (Pharmacia) that had been digested with HincII and gel purified away from the vector
backbone. The annealed fragment was amplified using 5' and 3' oligos nested within the ends of
the ORF fragment (5'-CCAGTGGTTTAGGCTGTGTGGTC (SEQ ID NO:69) and 5'-
CTGAGTTGGATGTATTGGATC (SEQ ID NO:70)), so that the resulting product contained
100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin resistance
cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR product was then
cloned into the vector pGemT easy (Promega) to create the construct pMON21682 and used for
Synechocystis transformation.
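
The fusion primers used for the knockout constructs combine a cassette-homologous portion with a gene-specific portion. The sketch below splits such a primer at a fixed position. It assumes, based only on the observation that the first 40 nucleotides are shared between the "kanpr1"-type primers for different ORFs, that this shared 5' portion is the kanamycin cassette homology; the function name is hypothetical.

```python
KAN_TAIL_LEN = 40  # assumed length of the cassette-homologous 5' portion

# SEQ ID NO:62 (1736kanpr1) and SEQ ID NO:78 (slr0056 counterpart) as read
# from the text after repairing obvious scanning errors.
PRIMER_1736KANPR1 = (
    "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTT" "CCACAATTCCCCGCACCGTC"
)
PRIMER_SLR0056_KAN = (
    "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTT" "CGCCAATACCAGCCACCAACAG"
)


def split_fusion_primer(primer, tail_len=KAN_TAIL_LEN):
    """Split a fusion primer into (cassette_part, gene_specific_part)."""
    return primer[:tail_len], primer[tail_len:]


if __name__ == "__main__":
    for name, primer in (
        ("1736kanpr1", PRIMER_1736KANPR1),
        ("slr0056 kan primer", PRIMER_SLR0056_KAN),
    ):
        cassette, gene = split_fusion_primer(primer)
        print(name, len(cassette), len(gene), gene)
```

For 1736kanpr1 the split gives a 40-nucleotide shared portion and a 20-nucleotide gene-specific portion, in line with the 40 bp and 20 bp of homology stated in the text.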

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The sll1899 5' sequence was amplified using the primers 5'-
GGATCCATGGTTACTTCGACAAAAATCC (SEQ ID NO:71) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGCTAGGCAACCGCTTAGTAC
(SEQ ID NO:72). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAACCCAACAGTAAAGTTCCC (SEQ ID NO:73) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGCCGGCATTGTCTTTTACATG
(SEQ ID NO:74). The amplification products were combined with the kanamycin resistance gene
from puc4K (Pharmacia) that had been digested with HincII and gel purified away from the vector
backbone. The annealed fragment was amplified using 5' and 3' oligos nested within the ends of
the ORF fragment (5'-GGAACCCTTGCAGCCGCTTC (SEQ ID NO:75)
and 5'-GTATGCCCAACTGGTGCAGAGG (SEQ ID NO:76)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin
resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR
product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21679 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The slr0056 5' sequence was amplified using the primers 5'-
GGATCCATGTCTGACACACAAAATACCG (SEQ ID NO:77) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCGCCAATACCAGCCACCAACAG
(SEQ ID NO:78). The 3' region was amplified using the synthetic oligonucleotide
primers 5'-GAATTCTCAAATCCCCGCATGGCCTAG (SEQ ID NO:79) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGCCTACGGCTTGGACGTGTGGG
(SEQ ID NO:80). The amplification products were combined with the kanamycin
resistance gene from puc4K (Pharmacia) that had been digested with HincII and gel purified
away from the vector backbone. The annealed fragment was amplified using 5' and 3' oligos
nested within the ends of the ORF fragment (5'-CACTTGGATTCCCCTGATCTG (SEQ ID
NO:81) and 5'-GCAATACCCGCTTGGAAAACG (SEQ ID NO:82)), so that the resulting
product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This
PCR product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21677 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The slr1518 5' sequence was amplified using the primers 5'-
GGATCCATGACCGAATCTTCGCCCCTAGC (SEQ ID NO:83) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCAATCCTAGGTAGCCGAGGCG
(SEQ ID NO:84). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAGCCCAGGCCAGCCCAGCC (SEQ ID NO:85) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGGAATTGATTTGTTTAATTACC
(SEQ ID NO:86). The amplification products were combined with the kanamycin resistance gene
from puc4K (Pharmacia) that had been digested with HincII and gel purified away from the vector
backbone. The annealed fragment was amplified using 5' and 3' oligos nested within the ends of
the ORF fragment (5'-GCGATCGCCATTATCGdTGG (SEQ ID NO:87) and 5'-
GCAGACTGGCAATTATCAGTAACG (SEQ ID NO:88)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin
resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR
product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21680 and used for Synechocystis transformation.

4B. Transformation of Synechocystis

Cells of Synechocystis 6803 were grown to a density of approximately 2x10* cells per ml
and harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11 medium
(ATCC Medium 616) at a density of 1x10^ cells per ml and used immediately for transformation.
One hundred microliters of these cells were mixed with 5 ul of mini prep DNA and incubated
with light at 30°C for 4 hours. This mixture was then plated onto nylon filters resting on BG-11
agar supplemented with TES pH 8 and allowed to grow for 12-18 hours. The filters were then
transferred to BG-11 agar + TES + 5 ug/ml kanamycin and allowed to grow until colonies
appeared within 7-10 days (Packer and Glazer, 1988). Colonies were then picked into BG-11
liquid media containing 5 ug/ml kanamycin and allowed to grow for 5 days. These cells were
then transferred to BG-11 media containing 10 ug/ml kanamycin and allowed to grow for 5 days
and then transferred to BG-11 + kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells
were then harvested for PCR analysis to determine the presence of a disrupted ORF and also for
HPLC analysis to determine if the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates for slr1736 and sll1899 showed complete
segregation of the mutant genome, meaning no copies of the wild type genome could be detected
in these strains. This suggests that function of the native gene is not essential for cell function.
HPLC analysis of these same isolates showed that the sll1899 strain had no detectable reduction
in tocopherol levels. However, the strain carrying the knockout for slr1736 produced no
detectable levels of tocopherol.

Amino acid sequences for the Synechocystis knockouts are compared using ClustalW,
and are provided in Table 3 below. Provided are the percent identities, percent similarity, and the
percent gap. The alignment of the sequences is provided in Figure 21.

Table 3:

                      slr1736 slr0926 sll1899 slr0056 slr1518
slr1736  % Identity                14      12      18      11
         % Similar                 29      30      34      26
         % Gap                      8       7      10       5
slr0926  % Identity                        20      19      14
         % Similar                         39      32      28
         % Gap                              7       9       4
sll1899  % Identity                                17      13
         % Similar                                 29      29
         % Gap                                     12       9
slr0056  % Identity                                        15
         % Similar                                         31
         % Gap                                              8
slr1518  % Identity
         % Similar
         % Gap



Amino acid sequence comparisons are performed using various Arabidopsis
prenyltransferase sequences and the Synechocystis sequences. The comparisons are presented in
Table 4 below. Provided are the percent identities, percent similarity, and the percent gap. The
alignment of the sequences is provided in Figure 22.



Table 4:

                        ATPT2 slr1736   ATPT3 slr0926   ATPT4 sll1899  ATPT12 slr0056   ATPT8 slr1518
ATPT2    % Identity                29       9       9       8       8      12       9       7       9
         % Similar                 46      23      21      20      20      28      23      21      20
         % Gap                     27      13      28      23      29      11      24      25      24
slr1736  % Identity                         9      13       8      12      13      15       8      10
         % Similar                         19      28      19      28      26      33      21      26
         % Gap                             34      12      34      15      26      10      12      10
ATPT3    % Identity                                23      11      14      13      10       5      11
         % Similar                                 36      26      26      26      21      14      22
         % Gap                                     29      21      31      16      30      30      30
slr0926  % Identity                                        12      20      17      20      11      14
         % Similar                                         24      37      28      33      24      2?
         % Gap                                             33      12      25      10      11       9
ATPT4    % Identity                                                18      11       8       6       7
         % Similar                                                 33      23      18      16      19
         % Gap                                                     28      19      32      32      33
sll1899  % Identity                                                        13      17      10      12
         % Similar                                                         24      30      23      26
         % Gap                                                             27      13      10      11
ATPT12   % Identity                                                                52       8      11
         % Similar                                                                 66      19      26
         % Gap                                                                     18      25      23
slr0056  % Identity                                                                         9      13
         % Similar                                                                         23      32
         % Gap                                                                             10       8
ATPT8    % Identity                                                                                 7
         % Similar                                                                                 23
         % Gap                                                                                      7
slr1518  % Identity
         % Similar
         % Gap

4C. Phytyl Prenyltransferase Enzyme Assays

[3H]Homogentisic acid in 0.1% H3PO4 (specific radioactivity 40 Ci/mmol). Phytyl
pyrophosphate was synthesized as described by Joe et al. (1973) Can. J. Biochem. 51:1527. 2-
Methyl-6-phytylquinol and 2,3-dimethyl-5-phytylquinol were synthesized as described by Soll et
al. (1980) Phytochemistry 19:215. Homogentisic acid, α-, β-, δ-, and γ-tocopherol, and tocol were
purchased commercially.

The wild-type strain of Synechocystis sp. PCC 6803 was grown in BG11 medium with
bubbling air at 30°C under 50 µE m^-2 s^-1 fluorescent light, and 70% relative humidity. The growth
medium of the slr1736 knock-out (potential PPT) strain of this organism was supplemented with 25
µg mL^-1 kanamycin. Cells were collected from 0.25 to 1 liter culture by centrifugation at 5000 g
for 10 min and stored at -80°C.

Total membranes were isolated according to Zak's procedures with some modifications (Zak
et al. (1999) Eur. J. Biochem. 261:311). Cells were broken on a French press. Before the French
press treatment, the cells were incubated for 1 hour with lysozyme (0.5%, w/v) at 30°C in a
medium containing 7 mM EDTA, 5 mM NaCl and 10 mM Hepes-NaOH, pH 7.4. The
spheroplasts were collected by centrifugation at 5000 g for 10 min and resuspended at 0.1 - 0.5 mg
chlorophyll mL^-1 in 20 mM potassium phosphate buffer, pH 7.8. A proper amount of protease
inhibitor cocktail and DNase I from Boehringer Mannheim were added to the solution. French
press treatments were performed two to three times at 100 MPa. After breakage, the cell
suspension was centrifuged for 10 min at 5000 g to pellet unbroken cells, and this was followed by
centrifugation at 100,000 g for 1 hour to collect total membranes. The final pellet was resuspended
in a buffer containing 50 mM Tris-HCl and 4 mM MgCl2.

Chloroplast pellets were isolated from 250 g of spinach leaves obtained from local markets.
Deveined leaf sections were cut into grinding buffer (2 l/250 g leaves) containing 2 mM EDTA, 1
mM MgCl2, 1 mM MnCl2, 0.33 M sorbitol, 0.1% ascorbic acid, and 50 mM Hepes at pH 7.5. The
leaves were homogenized for 3 sec three times in a 1-L blender, and filtered through 4 layers of
Miracloth. The supernatant was then centrifuged at 5000 g for 6 min. The chloroplast pellets were
resuspended in a small amount of grinding buffer (Douce et al. Methods in Chloroplast Molecular
Biology, 239 (1982)).

Chloroplasts in pellets can be broken in three ways. Chloroplast pellets were first aliquoted
in 1 mg of chlorophyll per tube, centrifuged at 6000 rpm for 2 min in a microcentrifuge, and
grinding buffer was removed. Two hundred microliters of Triton X-100 buffer (0.1% Triton X-
100, 50 mM Tris-HCl pH 7.6 and 4 mM MgCl2) or swelling buffer (10 mM Tris pH 7.6 and 4
mM MgCl2) was added to each tube and incubated for 1/4 hour at 4°C. Then the broken
chloroplast pellets were used for the assay immediately. In addition, broken chloroplasts can also
be obtained by freezing in liquid nitrogen and storing at -80°C for 1/4 hour, then used for the assay.

In some cases chloroplast pellets were further purified with a 40%/80% Percoll gradient to
obtain intact chloroplasts. The intact chloroplasts were broken with swelling buffer, then either
used for assay or further purified for envelope membranes with a 20.5%/31.8% sucrose density
gradient (Soll et al. (1980) supra). The membrane fractions were centrifuged at 100,000 g for 40
min and resuspended in 50 mM Tris-HCl pH 7.6, 4 mM MgCl2.

Various amounts of [3H]HGA, 40 to 60 µM unlabelled HGA with specific activity in the
range of 0.16 to 4 Ci/mmole were mixed with a proper amount of 1 M Tris-NaOH pH 10 to adjust
pH to 7.6. HGA was reduced for 4 min with a trace amount of solid NaBH4. In addition to HGA, the
standard incubation mixture (final vol 1 mL) contained 50 mM Tris-HCl, pH 7.6, 3-5 mM MgCl2,
and 100 µM phytyl pyrophosphate. The reaction was initiated by addition of Synechocystis total
membranes, spinach chloroplast pellets, spinach broken chloroplasts, or spinach envelope
membranes. The enzyme reaction was carried out for 2 hours at 23°C or 30°C in the dark or light.
The reaction was stopped by freezing with liquid nitrogen and the mixture stored at -80°C, or
stopped directly by extraction.

A constant amount of tocol was added to each assay mixture and reaction products were
extracted with a 2 mL mixture of chloroform/methanol (1:2, v/v) to give a monophasic solution.
NaCl solution (2 mL; 0.9%) was added with vigorous shaking. This extraction procedure was
repeated three times. The organic layer containing the prenylquinones was filtered through a 20
mn filter, evaporated under N2, and then resuspended in 100 µL of ethanol.

The samples were mainly analyzed by a Normal-Phase HPLC method (isocratic 90% hexane
and 10% methyl-t-butyl ether) using a Zorbax silica column, 4.6 x 250 mm. The samples were
also analyzed by a Reversed-Phase HPLC method (isocratic 0.1% H3PO4 in MeOH) using a
Vydac 201HS54 C18 column, 4.6 x 250 mm, coupled with an Alltech C18 guard column. The
amounts of products were calculated based on the substrate specific radioactivity, and adjusted
according to the % recovery based on the amount of internal standard.
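
The calculation described above can be sketched as follows. The input values in the usage example are hypothetical, and the source does not give the exact formula used; this sketch assumes the common conversion 1 Ci = 2.22 x 10^12 dpm and a simple division by the fractional recovery of the internal standard (tocol).

```python
DPM_PER_CI = 2.22e12  # disintegrations per minute in one curie


def product_pmol(dpm, specific_activity_ci_per_mmol, recovery_fraction):
    """Convert counted radioactivity into pmol of product.

    dpm: disintegrations per minute measured in the product peak.
    specific_activity_ci_per_mmol: specific radioactivity of the substrate.
    recovery_fraction: fraction (0-1] of internal standard recovered.
    """
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in (0, 1]")
    mmol = dpm / (DPM_PER_CI * specific_activity_ci_per_mmol)
    pmol = mmol * 1e9  # 1 mmol = 1e9 pmol
    return pmol / recovery_fraction


if __name__ == "__main__":
    # Hypothetical example: 22,200 dpm, 1 Ci/mmol substrate, 80% recovery.
    print(round(product_pmol(22200, 1.0, 0.8), 2))
```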
The amount of chlorophyll was determined as described in Arnon (1949) Plant Physiol. 24:1.
The amount of protein was determined by the Bradford method using gamma globulin as a standard
(Bradford (1976) Anal. Biochem. 72:248).

Results of the assay demonstrate that 2-methyl-6-phytylplastoquinone is not produced in
the Synechocystis slr1736 knockout preparations. The results of the phytyl prenyltransferase
enzyme activity assay for the slr1736 knockout are presented in Figure 23.

4D. Complementation of the slr1736 knockout with ATPT2

In order to determine whether ATPT2 could complement the knockout of slr1736 in
Synechocystis 6803, a plasmid was constructed to express the ATPT2 sequence from the TAC
promoter. A vector, plasmid pSL1211, was obtained from the lab of Dr. Himadri Pakrasi of
Washington University, and is based on the plasmid RSF1010, which is a broad host range
plasmid (Ng W.-O., Zentella R., Wang, Y., Taylor J-S. A., Pakrasi, H.B. 2000. phrA, the major
photoreactivating factor in the cyanobacterium Synechocystis sp. strain PCC 6803 codes for a
cyclobutane pyrimidine dimer specific DNA photolyase. Arch. Microbiol. (in press)). The
ATPT2 gene was isolated from the vector pCGN10817 by PCR using the following primers:
ATPT2nco.pr 5'-CCATGGATTCGAGTAAAGTTGTCGC (SEQ ID NO:89); ATPT2ri.pr
5'-GAATTCACTTCAAAAAAGGTAACAG (SEQ ID NO:90). These primers remove
approximately 112 bp from the 5' end of the ATPT2 sequence, which is thought to be the
chloroplast transit peptide. These primers also add an NcoI site at the 5' end and an EcoRI
site at the 3' end, which can be used for sub-cloning into subsequent vectors. The PCR product
from using these primers and pCGN10817 was ligated into pGEM T easy and the resulting
vector pMON21689 was confirmed by sequencing using the M13 forward and M13 reverse
primers. The NcoI/EcoRI fragment from pMON21689 was then ligated with the EagI/EcoRI
and EagI/NcoI fragments from pSL1211, resulting in pMON21690. The plasmid pMON21690
was introduced into the slr1736 Synechocystis 6803 KO strain via conjugation. Cells of sl906 (a
helper strain) and DH10B cells containing pMON21690 were grown to log phase (O.D. 600 =
0.4) and 1 ml was harvested by centrifugation. The cell pellets were washed twice with a sterile
BG-11 solution and resuspended in 200 ul of BG-11. The following was mixed in a sterile
eppendorf tube: 50 ul SL906, 50 ul DH10B cells containing pMON21690, and 100 ul of a fresh
culture of the slr1736 Synechocystis 6803 KO strain (O.D. 730 of 0.2-0.4). The cell mixture was
immediately transferred to a nitrocellulose filter resting on BG-11 and incubated for 24 hours at
30°C and 2500 lux (50 uE) of light. The filter was then transferred to BG-11 supplemented with
10 ug/ml gentamycin and incubated as above for ~5 days. When colonies appeared, they were
picked and grown up in liquid BG-11 + gentamycin 10 ug/ml (Elhai, J. and Wolk, P. 1988.
Conjugal transfer of DNA to Cyanobacteria. Methods in Enzymology 167, 747-54). The liquid
cultures were then assayed for tocopherols by harvesting 1 ml of culture by centrifugation,
extracting with ethanol/pyrogallol, and HPLC separation. The slr1736 Synechocystis 6803 KO
strain did not contain any detectable tocopherols, while the slr1736 Synechocystis 6803 KO
strain transformed with pMON21690 contained detectable alpha tocopherol. A Synechocystis
6803 strain transformed with pSL1211 (vector control) produced alpha tocopherol as well.
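As a small check of the primer design described in this section, the sketch below scans the two primers for the restriction sites they are said to add. The recognition sequences for NcoI (CCATGG) and EcoRI (GAATTC) are standard; the primer sequences are taken from the text with the extraction spacing removed, and the helper name `find_sites` is hypothetical.

```python
# Standard recognition sequences for the two enzymes named in the text
RESTRICTION_SITES = {"NcoI": "CCATGG", "EcoRI": "GAATTC"}

# Primer sequences as given in the text (SEQ ID NO:89 and SEQ ID NO:90)
PRIMERS = {
    "ATPT2nco.pr": "CCATGGATTCGAGTAAAGTTGTCGC",
    "ATPT2ri.pr": "GAATTCACTTCAAAAAAGGTAACAG",
}


def find_sites(sequence):
    """Return {enzyme: 0-based position of first match} for sites found in sequence."""
    seq = sequence.upper()
    return {
        enzyme: seq.find(site)
        for enzyme, site in RESTRICTION_SITES.items()
        if site in seq
    }


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

Each primer carries its site at the very 5' end, which is where an added cloning site would be expected. The NcoI site (CCATGG) also contains an ATG, which can serve as a start codon for the truncated coding sequence.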

4E: Additional Evidence of Prenyltransferase Activity 

To test the hypothesis that slr1736 or ATPT2 is sufficient as a single gene to obtain
phytyl prenyltransferase activity, both genes were expressed in SF9 cells and in yeast. When
either slr1736 or ATPT2 was expressed in insect cells (Table 5) or in yeast, phytyl
prenyltransferase activity was detectable in membrane preparations, whereas membrane
preparations of the yeast vector control or of control insect cells did not exhibit
phytyl prenyltransferase activity.

Table 5: Phytyl prenyltransferase activity

Enzyme source                      Enzyme activity [pmol/mg x h]

slr1736 expressed in SF9 cells     20
ATPT2 expressed in SF9 cells       6
SF9 cell control                   <0.05
Synechocystis 6803                 0.25
Spinach chloroplasts               0.20

Example 5: Transgenic Plant Analysis
5A. Arabidopsis

Arabidopsis plants transformed with constructs for the sense or antisense expression of
the ATPT proteins were analyzed by High Pressure Liquid Chromatography (HPLC) for altered
levels of total tocopherols, as well as altered levels of specific tocopherols (alpha, beta, gamma,
and delta tocopherol).

Extracts of leaves and seeds were prepared for HPLC as follows. For seed extracts, 10
mg of seed was added to 1 g of microbeads (Biospec) in a sterile microfuge tube to which 500 ul
1% pyrogallol (Sigma Chem)/ethanol was added. The mixture was shaken for 3 minutes in a
mini Beadbeater (Biospec) on "fast" speed. The extract was filtered through a 0.2 um filter into
an autosampler tube. The filtered extracts were then used in the HPLC analysis described below.
Leaf extracts were prepared by mixing 30-50 mg of leaf tissue with 1 g microbeads and
freezing in liquid nitrogen until extraction. For extraction, 500 ul 1% pyrogallol in ethanol was
added to the leaf/bead mixture and shaken for 1 minute on a Beadbeater (Biospec) on "fast"
speed. The resulting mixture was centrifuged for 4 minutes at 14,000 rpm and filtered as
described above prior to HPLC analysis.

HPLC was performed on a Zorbax silica HPLC column (4.6 mm x 250 mm) with
fluorescence detection (excitation at 290 nm, emission at 336 nm, and bandpass and slits).
Solvent A was hexane and solvent B was methyl-t-butyl ether. The injection volume was 20 ul,
the flow rate was 1.5 ml/min, and the run time was 12 min (40°C) using the gradient in Table 6:

Table 6:

Time       Solvent A    Solvent B

0 min.     90%          10%
10 min.    90%          10%
11 min.    25%          75%
12 min.    90%          10%

Tocopherol standards in 1% pyrogallol/ethanol were also run for comparison (alpha
tocopherol, gamma tocopherol, beta tocopherol, delta tocopherol, and tocopherol (tocol) (all from
Matreya)).

Standard curves for alpha, beta, delta, and gamma tocopherol were calculated using
ChemStation software. The absolute amount of component x is: Absolute amount of x =
Response_x × RF_x × dilution factor, where Response_x is the area of peak x, RF_x is the response
factor for component x (Amount_x/Response_x) and the dilution factor is 500 ul. The ng/mg tissue
is found by: total ng component/mg plant tissue.
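The quantification formula above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the dilution factor is treated as a plain multiplier, and the numbers in the usage note are invented to show the arithmetic, not measured values.

```python
def response_factor(standard_amount_ng, standard_peak_area):
    """RF_x = Amount_x / Response_x, derived from a standard of known amount."""
    if standard_peak_area <= 0:
        raise ValueError("peak area must be positive")
    return standard_amount_ng / standard_peak_area


def absolute_amount_ng(peak_area, rf, dilution_factor):
    """Absolute amount of x = Response_x * RF_x * dilution factor."""
    return peak_area * rf * dilution_factor


def ng_per_mg_tissue(total_ng, tissue_mg):
    """Normalize the total amount of a component to the mass of plant tissue."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    return total_ng / tissue_mg
```

For instance, a hypothetical standard of 10 ng giving a peak area of 2000 yields a response factor of 0.005; a sample peak of area 4000 with a dilution factor of 1 then corresponds to 20 ng, or 2 ng/mg for 10 mg of seed.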

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines containing
pMON10822 for the expression of ATPT2 from the napin promoter are provided in Figure 24.

HPLC analysis results of segregating T2 Arabidopsis seed tissue expressing the ATPT2
sequence from the napin promoter (pCGN10822) demonstrate an increased level of tocopherols
in the seed. Total tocopherol levels are increased as much as 50% over the total tocopherol
levels of non-transformed (wild-type) Arabidopsis plants (Figure 25). Homozygous progeny
from the top 3 lines (T3 seed) have up to a two-fold (100%) increase in total tocopherol levels
over control Arabidopsis seed (Figure 26).

Furthermore, levels of particular tocopherols are also increased in transgenic
Arabidopsis plants expressing the ATPT2 nucleic acid sequence from the napin promoter.
Levels of delta tocopherol in these lines are increased greater than 3 fold over the delta
tocopherol levels obtained from the seeds of wild type Arabidopsis lines. Levels of gamma
tocopherol in transgenic Arabidopsis lines expressing the ATPT2 nucleic acid sequence are
increased as much as about 60% over the levels obtained in the seeds of non-transgenic control
lines. Furthermore, levels of alpha tocopherol are increased as much as 3 fold over those
obtained from non-transgenic control lines.

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines containing
pCGN10803 for the expression of ATPT2 from the enhanced 35S promoter (antisense
orientation) are provided in Figure 25. Two lines were identified that have reduced total
tocopherols, with up to a ten-fold decrease observed in T3 seed compared to control Arabidopsis
(Figure 27).

5B. Canola

Brassica napus, variety SP30021, was transformed with pCGN10822 (napin-ATPT2-
napin 3', sense orientation) using Agrobacterium tumefaciens-mediated transformation. Flowers
of the R0 plants were tagged upon pollination and developing seed was collected at 35 and 45
days after pollination (DAP).

Developing seed was assayed for tocopherol levels, as described above for Arabidopsis.
Line 10822-1 shows a 20% increase of total tocopherols, compared to the wild-type control, at 45
DAP. Figure 28 shows total tocopherol levels measured in developing canola seed.


Example 6: Sequences to Tocopherol Cyclase 
6A. Preparation of the slr1737 Knockout

The Synechocystis sp. 6803 slr1737 knockout was constructed by the following method.
The GPS™-1 Genome Priming System (New England Biolabs) was used to insert, by a Tn7
Transposase system, a kanamycin resistance cassette into slr1737. A plasmid from a
Synechocystis genomic library clone containing 652 base pairs of the targeted orf (Synechocystis
genome base pairs 1324051-1324703; the predicted orf base pairs 1323672-1324763, as
annotated by Cyanobase) was used as target DNA. The reaction was performed according to the
manufacturer's protocol. The reaction mixture was then transformed into E. coli DH10B
electrocompetent cells and plated. Colonies from this transformation were then screened for
transposon insertions into the target sequence by amplifying with M13 Forward and Reverse
Universal primers, yielding a product of 652 base pairs plus ~1700 base pairs, the size of the
transposon kanamycin cassette, for a total fragment size of ~2300 base pairs. After this
determination, it was then necessary to determine the approximate location of the insertion
within the targeted orf, as 100 base pairs of orf sequence was estimated as necessary for efficient
homologous recombination in Synechocystis. This was accomplished through amplification
reactions using either of the primers to the ends of the transposon, Primer S (5' end) or N (3'
end), in combination with either an M13 Forward or Reverse primer. That is, four different primer
combinations were used to map each potential knockout construct: Primer S - M13 Forward,
Primer S - M13 Reverse, Primer N - M13 Forward, Primer N - M13 Reverse. The construct



WO 02/33060    PCT/US01/42673



used to transform Synechocystis and knock out slr1737 was determined to consist of
approximately 150 base pairs of slr1737 sequence on the 5' side of the transposon insertion and
approximately 500 base pairs on the 3' side, with the transcription of the orf and kanamycin
cassette in the same direction. The nucleic acid sequence of slr1737 is provided in SEQ ID
NO:38; the deduced amino acid sequence is provided in SEQ ID NO:39.
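The screening and mapping logic described above can be sketched as follows. This is only an illustration of the bookkeeping: the four primer pairings and the 652 bp and ~1700 bp sizes come from the text, while the function names are hypothetical.

```python
from itertools import product

# Primers to the transposon ends and to the vector, as named in the text
TRANSPOSON_PRIMERS = ["Primer S", "Primer N"]
VECTOR_PRIMERS = ["M13 Forward", "M13 Reverse"]


def mapping_primer_pairs():
    """Enumerate the four primer combinations used to map each construct."""
    return list(product(TRANSPOSON_PRIMERS, VECTOR_PRIMERS))


def expected_screen_product_bp(target_bp=652, cassette_bp=1700):
    """Expected M13 Forward/Reverse product when the cassette sits in the target."""
    return target_bp + cassette_bp


for pair in mapping_primer_pairs():
    print(" - ".join(pair))
print(expected_screen_product_bp())
```

The sum of 652 and approximately 1700 base pairs is about 2350, in line with the total fragment size of roughly 2300 base pairs given in the text.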

Cells of Synechocystis 6803 were grown to a density of ~2 x 10^8 cells per ml and
harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11 medium at a
density of 1 x 10^9 cells per ml and used immediately for transformation. 100 ul of these cells were
mixed with 5 ul of mini prep DNA and incubated with light at 30°C for 4 hours. This mixture
was then plated onto nylon filters resting on BG-11 agar supplemented with TBS pH 8 and
allowed to grow for 12-18 hours. The filters were then transferred to BG-11 agar + TBS +
5 ug/ml kanamycin and allowed to grow until colonies appeared within 7-10 days (Packer and
Glazer, 1988). Colonies were then picked into BG-11 liquid media containing 5 ug/ml
kanamycin and allowed to grow for 5 days. These cells were then transferred to BG-11 media
containing 10 ug/ml kanamycin and allowed to grow for 5 days and then transferred to BG-11 +
kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells were then harvested for PCR
analysis to determine the presence of a disrupted ORF and also for HPLC analysis to determine if
the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates, using primers to the ends of the slr1737 orf,
showed complete segregation of the mutant genome, meaning no copies of the wild type genome
could be detected in these strains. This suggests that function of the native gene is not essential
for cell function. HPLC analysis of the strain carrying the knockout for slr1737 showed no
detectable levels of tocopherol.


6B. The relation of slr1737 and slr1736

The slr1737 gene occurs in Synechocystis downstream of and in the same orientation as
slr1736, the phytyl prenyltransferase. In bacteria this proximity often indicates an operon
structure and therefore an expression pattern that is linked in all genes belonging to this operon.
Occasionally such operons contain several genes that are required to constitute one enzyme. To
confirm that slr1737 is not required for phytyl prenyltransferase activity, phytyl prenyltransferase
was measured in extracts from the Synechocystis slr1737 knockout mutant. Figure 29 shows that
extracts from the Synechocystis slr1737 knockout mutant still contain phytyl prenyltransferase
activity. The molecular organization of genes in Synechocystis 6803 is shown in A. Figures B
and C show HPLC traces (normal phase HPLC) of reaction products obtained with membrane
preparations from Synechocystis wild type and slr1737 knockout membrane preparations,
respectively.

The fact that slr1737 is not required for the PPT activity provides additional data that
ATPT2 and slr1736 encode phytyl prenyltransferases.

6C. Synechocystis Knockouts

Synechocystis 6803 wild type and Synechocystis slr1737 knockout mutant were grown
photoautotrophically. Cells from a 20 ml culture of the late logarithmic growth phase were
harvested and extracted with ethanol. Extracts were separated by isocratic normal-phase HPLC
using hexane/methyl-t-butyl ether (95/5) and a Zorbax silica column, 4.6 x 250 mm.
Tocopherols and tocopherol intermediates were detected by fluorescence (excitation 290 nm,
emission 336 nm) (Figure 30).

Extracts of Synechocystis 6803 contained a clear signal of alpha-tocopherol. 2,3-
Dimethyl-5-phytylplastoquinol was below the limit of detection in extracts from the
Synechocystis wild type (C). In contrast, extracts from the Synechocystis slr1737 knockout
mutant did not contain alpha-tocopherol, but contained 2,3-dimethyl-5-phytylplastoquinol (D),
indicating that the interruption of slr1737 has resulted in a block of the 2,3-dimethyl-5-
phytylplastoquinol cyclase reaction.

Chromatograms of standard compounds alpha, beta, gamma, delta-tocopherol and 2,3-
dimethyl-5-phytylplastoquinol are shown in A and B. Chromatograms of extracts from
Synechocystis wild type and the Synechocystis slr1737 knockout mutant are shown in C and D,
respectively. Abbreviations: 2,3-DMPQ, 2,3-dimethyl-5-phytylplastoquinol.

6D. Incubation with Lysozyme treated Synechocystis 

Synechocystis 6803 wild type and slr1737 knockout mutant cells from the late logarithmic
growth phase (approximately 1 g wet cells per experiment in a total volume of 3 ml) were treated
with lysozyme and subsequently incubated with S-adenosylmethionine and
phytyl pyrophosphate, plus radiolabelled homogentisic acid. After 17 h incubation in the dark at
room temperature the samples were extracted with 6 ml chloroform/methanol (1/2 v/v). Phase
separation was obtained by the addition of 6 ml 0.9% NaCl solution. This procedure was
repeated three times. Under these conditions 2,3-dimethyl-5-phytylplastoquinol is oxidized to
form 2,3-dimethyl-5-phytylplastoquinone.

The extracts were analyzed by normal phase and reverse phase HPLC. Using extracts
from wild type Synechocystis cells, radiolabelled gamma-tocopherol and traces of radiolabelled
2,3-dimethyl-5-phytylplastoquinone were detected. When extracts from the slr1737 knockout
mutant were analyzed, only radiolabelled 2,3-dimethyl-5-phytylplastoquinone was detectable.
The amount of 2,3-dimethyl-5-phytylplastoquinone was significantly increased compared to wild
type extracts. Heat treated samples of the wild type and the slr1737 knockout mutant did not
produce radiolabelled 2,3-dimethyl-5-phytylplastoquinone, nor radiolabelled tocopherols. These
results further support the role of the slr1737 expression product in the cyclization of 2,3-
dimethyl-5-phytylplastoquinol.


6E. Arabidopsis Homologue to slr1737

An Arabidopsis homologue to slr1737 was identified from a BLASTALL search using
Synechocystis sp. 6803 gene slr1737 as the query, in both public and proprietary databases. SEQ
ID NO:109 and SEQ ID NO:110 are the DNA and translated amino acid sequences, respectively,
of the Arabidopsis homologue to slr1737. The start is found at the ATG at base 56 in SEQ ID
NO:109.

The sequence obtained for the homologue from the proprietary database differs from the
public database sequence (F4D11.30, BAC AL022537) in having a start site 471 base pairs upstream
of the start identified in the public sequence. A comparison of the public and proprietary sequences
is provided in Figure 31. The correct start within the public database sequence is at
12080, while the public sequence start is given as being at 11609.
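The 471 base pair figure follows directly from the two coordinates given above, as this small check shows. The function name is hypothetical, and the sketch simply takes the absolute difference between the two positions in the public database sequence.

```python
def start_site_offset_bp(correct_start, annotated_start):
    """Distance in base pairs between two start coordinates on the same sequence."""
    return abs(correct_start - annotated_start)


# Coordinates within the public database sequence, as given in the text
print(start_site_offset_bp(12080, 11609))
```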

Attempts to amplify a slr1737 homologue were unsuccessful using primers designed from
the public database, while amplification of the gene was accomplished with primers obtained
from SEQ ID NO:109.

Analysis of the protein sequence to identify transit peptide sequence predicted two
potential cleavage sites, one between amino acids 48 and 49, and the other between amino acids
98 and 99.

6F. slr1737 Protein Information

The slr1737 orf comprises 363 amino acid residues and has a predicted MW of 41 kDa
(SEQ ID NO:39). Hydropathic analysis indicates the protein is hydrophilic (Figure 32).

The Arabidopsis homologue to slr1737 (SEQ ID xx) comprises 488 amino acid residues,
has a predicted MW of 55 kDa, and has a putative transit peptide sequence comprising the first
98 amino acids. The predicted MW of the mature form of the Arabidopsis homologue is 44 kDa.
The hydropathic plot for the Arabidopsis homologue also reveals that it is hydrophilic (Figure
33). Further BLAST analysis of the Arabidopsis homologue reveals limited sequence identity (25%
sequence identity) with the beta-subunit of respiratory nitrate reductase. The sequence
identity to nitrate reductase suggests that the slr1737 orf encodes an enzyme that likely involves
a general acid catalysis mechanism.
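As a rough plausibility check on the molecular weights quoted above, the sketch below applies the common rule of thumb of about 110 Da per amino acid residue. This is an assumption introduced here for illustration and not the method used to obtain the predicted values in the text, which would depend on the actual sequences; the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def estimate_mw_kda(n_residues):
    """Approximate protein molecular weight in kDa from residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def mature_length(total_residues, transit_peptide_residues):
    """Residues remaining after removal of an N-terminal transit peptide."""
    return total_residues - transit_peptide_residues


print(estimate_mw_kda(363))                      # slr1737 orf
print(estimate_mw_kda(488))                      # Arabidopsis homologue, precursor
print(estimate_mw_kda(mature_length(488, 98)))   # Arabidopsis homologue, mature form
```

The estimates come out at roughly 40, 54 and 43 kDa, in the same range as the predicted values of 41, 55 and 44 kDa given in the text.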

Investigation of known enzymes involved in tocopherol metabolism indicated that the
best candidate corresponding to the general acid mechanism is the tocopherol cyclase. There are
many known examples of cyclases, including tocopherol cyclase, chalcone isomerase, lycopene
cyclase, and aristolochene synthase. On further examination of the microscopic catalytic
mechanism of phytylplastoquinol cyclization, chalcone isomerase, as an example, has a catalytic
mechanism most similar to tocopherol cyclase (Figure 34).

Multiple sequence alignment was performed between slr1737, the slr1737 Arabidopsis
homologue and the Arabidopsis chalcone isomerase (Genbank: P41088) (Figure 35). 65% of the
conserved residues among the three enzymes are strictly conserved within the known chalcone
isomerases. The crystal structure of alfalfa chalcone isomerase has been solved (Jez, Joseph M.,
Bowman, Marianne E., Dixon, Richard A., and Noel, Joseph P. (2000) "Structure and
mechanism of the evolutionarily unique plant enzyme chalcone isomerase". Nature Structural
Biology 7: 786-791). It has been demonstrated that tyrosine (Y) 106 of the alfalfa chalcone
isomerase (Genbank: P28012) serves as the general acid during the cyclization reaction. The
equivalent residue in slr1737 and the slr1737 Arabidopsis homolog is lysine (K), which is an
excellent catalytic residue as general acid.

The information available from partial purification of tocopherol cyclase from Chlorella
protothecoides (U.S. Patent No. 5,432,069), i.e., described as being glycine rich, water soluble
and with a predicted MW of 48-50 kDa, is consistent with the protein informatics information
obtained for slr1737 and the Arabidopsis slr1737 homologue.

All publications and patent applications mentioned in this specification are indicative of
the level of skill of those skilled in the art to which this invention pertains. All publications and
patent applications are herein incorporated by reference to the same extent as if each individual
publication or patent application was specifically and individually indicated to be incorporated by
reference.

Although the foregoing invention has been described in some detail by way of illustration
and example for purposes of clarity of understanding, it will be obvious that certain changes and
modifications may be practiced within the scope of the appended claims.

CLAIMS

What is claimed is:

1. An isolated nucleic acid sequence encoding a prenyltransferase.

2. An isolated nucleic acid sequence according to Claim 1, wherein said prenyltransferase is
selected from the group consisting of straight chain prenyltransferase and aromatic prenyltransferase.

3. An isolated DNA sequence according to Claim 1, wherein said nucleic acid sequence is
isolated from a eukaryotic cell source.

4. An isolated DNA sequence according to Claim 3, wherein said eukaryotic cell source is
selected from the group consisting of mammalian, nematode, fungal, and plant cells.

5. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
Arabidopsis.

6. The DNA encoding sequence of Claim 5 wherein said prenyltransferase protein is encoded by
a sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:14,
SEQ ID NO:15, and SEQ ID NO:16.

7. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
soybean.

8. The DNA encoding sequence of Claim 7 wherein said prenyltransferase protein is encoded by
a sequence comprising a nucleotide sequence selected from the group consisting of SEQ ID NO:19, SEQ
ID NO:20, SEQ ID NO:21, SEQ ID NO:22, and SEQ ID NO:23.

9. The DNA encoding sequence of Claim 7 wherein said prenyltransferase protein is encoded by
a sequence selected from the group consisting of SEQ ID NO:95, and SEQ ID NO:96.

10. The DNA encoding sequence of Claim 7 wherein said prenyltransferase protein has an
amino acid sequence selected from the group consisting of SEQ ID NO:97, and SEQ ID NO:98.

11. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from corn.

12. The DNA encoding sequence of Claim 11 wherein said prenyltransferase protein is encoded
by a sequence comprising a nucleotide sequence selected from the group consisting of SEQ ID NO:25,
SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:104,
SEQ ID NO:105, and SEQ ID NO:106.

13. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from rice.

14. The DNA encoding sequence of Claim 13 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:99.

15. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
wheat.

16. The DNA encoding sequence of Claim 15 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:100.

17. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from leek.

18. The DNA encoding sequence of Claim 17 wherein said prenyltransferase protein is encoded
by a sequence comprising a nucleotide sequence selected from the group consisting of SEQ ID NO:101,
and SEQ ID NO:102.

19. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
canola.

20. The DNA encoding sequence of Claim 19 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:103.

21. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
cotton.

22. The DNA encoding sequence of Claim 21 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:107.

23. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
tomato.

24. The DNA encoding sequence of Claim 23 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:108.

25. An isolated DNA sequence according to Claim 4, wherein said prokaryotic source is a
Synechocystis sp.

26. A nucleic acid construct comprising as operably linked components, a transcriptional
initiation region functional in a host cell, a nucleic acid sequence encoding a prenyltransferase, and a
transcriptional termination region.

27. A nucleic acid construct according to Claim 26, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from an organism selected from the group consisting of a eukaryotic
organism and a prokaryotic organism.

28. A nucleic acid construct according to Claim 27, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from a plant source.

29. A nucleic acid construct according to Claim 28, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from a source selected from the group consisting of Arabidopsis, soybean,
corn, rice, wheat, leek, canola, cotton, and tomato.

30. A nucleic acid construct according to Claim 26, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from a Synechocystis sp.

31. A plant cell comprising the construct of Claim 26.

32. A plant comprising a cell of Claim 31.

33. A feed composition produced from a plant according to Claim 32.

34. A seed comprising a cell of Claim 31.

35. Oil obtained from a seed of Claim 34.

36. A natural tocopherol rich refined and deodorised oil which has been produced by a
method of treating an oil according to Claim 35 by distilling under low pressure and high
temperature, wherein said refined oil has reduced free fatty acids and a substantial percentage of
tocopherol present in the pretreated oil.

37. A refined oil according to Claim 36, wherein the pretreated oil is crude or pre-treated
soybean oil.

38. A refined oil according to Claim 36, wherein the refined oil is degummed and
bleached.

40. A method for the alteration of the isoprenoid content in a host cell, said method comprising:
transforming said host cell with a construct comprising as operably linked components, a transcriptional
initiation region functional in a host cell, a nucleic acid sequence encoding prenyltransferase, and a
transcriptional termination region,

wherein said isoprenoid compound is selected from the group of tocopherols and tocotrienols.

41. The method according to Claim 40, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

42. The method according to Claim 41, wherein said prokaryotic cell is a Synechocystis sp.

43. The method according to Claim 41, wherein said eukaryotic cell is a plant cell.

44. The method according to Claim 43, wherein said plant cell is obtained from a plant selected
from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola, cotton, and
tomato.

45. A method for producing an isoprenoid compound of interest in a host cell, said method
comprising obtaining a transformed host cell, said host cell having and expressing in its genome:

a construct having a DNA sequence encoding a prenyltransferase operably linked to a
transcriptional initiation region functional in a host cell,

wherein said prenyltransferase is involved in the synthesis of tocopherols,

and wherein said isoprenoid compound is selected from the group of tocopherols and tocotrienols.

46. The method according to Claim 45, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

47. The method according to Claim 46, wherein said prokaryotic cell is a Synechocystis sp.

48. The method according to Claim 46, wherein said eukaryotic cell is a plant cell.

49. The method according to Claim 48, wherein said plant cell is obtained from a plant selected
from the group consisting wherein said compound selected from the group of Arabidopsis, soybean,
corn, rice, wheat, leek, canola, cotton, and tomato.

50. A method for increasing the biosynthetic flux in a host cell toward production of an
isoprenoid compound, said method comprising:

transforming said host cell with a construct comprising as operably linked components, a
transcriptional initiation region functional in a host cell, a DNA encoding a prenyltransferase,
and a transcriptional termination region,

wherein said isoprenoid compound is selected from the group of tocopherols and
tocotrienols.

51. The method according to Claim 50, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

52. The method according to Claim 51, wherein said prokaryotic cell is a Synechocystis sp.

53. The method according to Claim 51, wherein said eukaryotic cell is a plant cell.

54. The method according to Claim 50, wherein said plant cell is obtained from a plant selected
from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola, cotton, and
tomato.

55. The method according to Claim 50, wherein said transcriptional initiation region is a
seed-specific promoter.

Query Sequence: F4D11, AL022537

Database: PIR T04448

Plus (+) denotes forward strand, and minus (-) reverse strand.
Asterisks (*) denote bases not shown on pairwise alignments.

Alignment 1

[Nucleotide alignment of the genomic query (rows beginning at query positions 12194 through 11835) with EST ATCEA4C371+ (positions 1 to 299). Annotations on the figure read "MET: first" and "Synecho seq aligns from here".]

FIG. 31

[Alignment 1, continued: rows beginning at query positions 11715 through 11415 (a note on the figure reads "60 bp removed"), with EST ATCEA4C371+ from position 299 and PIR:T04448 from residue 1. An annotation beside the first PIR:T04448 residues reads "arab sequence which is incorrect".]

FIG. 31 (CONT-1)

[Alignment 1, continued: rows beginning at query positions 11355 through 10935, with EST ATCEA4C371+ from position 482 and PIR:T04448 from residue 86.]

ATCEA4C371+ Exon 11538 11301 Confidence: 100 100
PIR:T04448 Exon 11609 11294 Confidence: 100 100
PIR:T04448 Exon 11083 11004 Confidence: 96 100

FIG. 31 (CONT-2)

[Alignment 1, continued: rows beginning at query positions 10875 through 10455, with PIR:T04448 from residue 133.]

PIR:T04448 Exon 10844 10768 Confidence: 100 100
PIR:T04448 Exon 10655 10486 Confidence: 96 100

FIG. 31 (CONT-3)

[Alignment 1, continued: rows beginning at query positions 10395 through 9915, with PIR:T04448 from residue 216 and GSDB:S:495- from position 532.]

PIR:T04448 Exon 10336 10239 Confidence: 96 100
PIR:T04448 Exon 10115 10008 Confidence: 100 100

FIG. 31 (CONT-4)

[Alignment 1, continued: rows beginning at query positions 9855 through 9556, with PIR:T04448 from residue 305 and GSDB:S:495-.]

PIR:T04448 Exon 9917 9801 Confidence: 100 100
GSDB:S:495- Exon 9961 9801 Confidence: 93 93
PIR:T04448 Exon 9704 9555 Confidence: 100 100
GSDB:S:495- Exon 9704 9555 Confidence: 98 100

FIG. 31 (CONT-5)

[Alignment 1, continued: rows beginning at query positions 9496 through 9136, with PIR:T04448 from residue 382 and GSDB:S:495- from position 321. The figure marks the stop codon.]

PIR:T04448 Exon 9522 9274 Confidence: 100 100
GSDB:S:495- Exon 9450 9130 Confidence: 98 100

FIG. 31 (CONT-6)

ATCEA4C37145_1 3063693|emb|CAA18584.1| 4.0e-43 (AL022537) putative protein
[Arabidopsis thaliana]

PIR:T04448 PIR-T04448 hypothetical protein F4D11.30 - Arabidopsis thaliana;
g3063693|emb|CAA18584.1| (AL022537) putative protein [Arabidopsis thaliana] F4D11.30

GSDB:S:4955486|AI995392|AI995392 701673779 A. thaliana, Columbia Col-0,
inflorescence-1 Arabidopsis thaliana cDNA clone 701673779, mRNA sequence.

[Multiple amino acid sequence alignment of slr1737_SYNSP_S74814, slr1737_ARATH_T04448, and CFI_ARATH_P41088.]

FIG. 35

SEQUENCE LISTING

<110> Lassner, Michael 

Post-Beittenmiller, Martha 
Savidge, Beth 
Weiss, James

<120> Nucleic Acid Sequences Involved in
Tocopherol Synthesis

<130> 17133/00/WO

<150> 60/129,899
<151> 1999-04-15

<150> 60/146,461
<151> 1999-07-30

<150> PCT/US00/10368
<151> 2000-04-14

<160> 94 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1182 
<212> DNA 

<213> Arabidopsis sp 
<400> 1 

atggagtctc tgctctctag ttcttctctt gtttccgctg ctggtgggtt ttgttggaag
60

aagcagaatc taaagctcca ctctttatca gaaatccgag ttctgcgttg tgattcgagt
120

aaagttgtcg caaaaccgaa gtttaggaac aatcttgtta ggcctgatgg tcaaggatct
180

tcattgttgt tgtatccaaa acataagtcg agatttcggg ttaatgccac tgcgggtcag
240

cctgaggctt tcgactcgaa tagcaaacag aagtctttta gagactcgtt agatgcgttt
300

tacaggtttt ctaggcctca tacagttatt ggcacagtgc ttagcatttt atctgtatct
360

ttcttagcag tagagaaggt ttctgatata tctcctttac ttttcactgg catcttggag
420

gctgttgttg cagctctcat gatgaacatt tacatagttg ggctaaatca gttgtctgat
480

gttgaaatag ataaggttaa caagccctat cttccattgg catcaggaga atattctgtt
540

aacaccggca ttgcaatagt agcttccttc tccatcatga gtttctggct tgggtggatt
600

gttggttcat ggccattgtt ctgggctctt tttgtgagtt tcatgctcgg tactgcatac
660

tctatcaatt tgccactttt acggtggaaa agatttgcat tggttgcagc aatgtgtatc
720

ctcgctgtcc gagctattat tgttcaaatc gccttttatc tacatattca gacacatgtg
780

tttggaagac caatcttgtt cactaggcct cttattttcg ccactgcgtt tatgagcttt
840

ttctctgtcg ttattgcatt gtttaaggat atacctgata tcgaagggga taagatattc
900

ggaatccgat cattctctgt aactctgggt cagaaacggg tgttttggac atgtgttaca
960

ctacttcaaa tggcttacgc tgttgcaatt ctagttggag ccacatctcc attcatatgg
1020

agcaaagtca tctcggttgt gggtcatgtt atactcgcaa caactttgtg ggctcgagct
1080

aagtccgttg atctgagtag caaaaccgaa ataacttcat gttatatgtt catatggaag
1140

ctcttttatg cagagtactt gctgttacct tttttgaagt ga
1182

<210> 2 
<211> 393 
<212> PRT 

<213> Arabidopsis sp 



<400> 2 



Met Glu Ser Leu Leu Ser Ser Ser Ser Leu Val Ser Ala Ala Gly Gly
1 5 10 15

Phe Cys Trp Lys Lys Gln Asn Leu Lys Leu His Ser Leu Ser Glu Ile
20 25 30

Arg Val Leu Arg Cys Asp Ser Ser Lys Val Val Ala Lys Pro Lys Phe
35 40 45

Arg Asn Asn Leu Val Arg Pro Asp Gly Gln Gly Ser Ser Leu Leu Leu
50 55 60

Tyr Pro Lys His Lys Ser Arg Phe Arg Val Asn Ala Thr Ala Gly Gln
65 70 75 80

Pro Glu Ala Phe Asp Ser Asn Ser Lys Gln Lys Ser Phe Arg Asp Ser
85 90 95

Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile Gly Thr
100 105 110

Val Leu Ser Ile Leu Ser Val Ser Phe Leu Ala Val Glu Lys Val Ser
115 120 125

Asp Ile Ser Pro Leu Leu Phe Thr Gly Ile Leu Glu Ala Val Val Ala
130 135 140

Ala Leu Met Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu Ser Asp
145 150 155 160

Val Glu Ile Asp Lys Val Asn Lys Pro Tyr Leu Pro Leu Ala Ser Gly
165 170 175

Glu Tyr Ser Val Asn Thr Gly Ile Ala Ile Val Ala Ser Phe Ser Ile
180 185 190

Met Ser Phe Trp Leu Gly Trp Ile Val Gly Ser Trp Pro Leu Phe Trp
195 200 205

Ala Leu Phe Val Ser Phe Met Leu Gly Thr Ala Tyr Ser Ile Asn Leu
210 215 220

Pro Leu Leu Arg Trp Lys Arg Phe Ala Leu Val Ala Ala Met Cys Ile
225 230 235 240

Leu Ala Val Arg Ala Ile Ile Val Gln Ile Ala Phe Tyr Leu His Ile
245 250 255

Gln Thr His Val Phe Gly Arg Pro Ile Leu Phe Thr Arg Pro Leu Ile
260 265 270

Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala Leu Phe
275 280 285

Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Ile Phe Gly Ile Arg Ser
290 295 300

Phe Ser Val Thr Leu Gly Gln Lys Arg Val Phe Trp Thr Cys Val Thr
305 310 315 320

Leu Leu Gln Met Ala Tyr Ala Val Ala Ile Leu Val Gly Ala Thr Ser
325 330 335

Pro Phe Ile Trp Ser Lys Val Ile Ser Val Val Gly His Val Ile Leu
340 345 350

Ala Thr Thr Leu Trp Ala Arg Ala Lys Ser Val Asp Leu Ser Ser Lys
355 360 365

Thr Glu Ile Thr Ser Cys Tyr Met Phe Ile Trp Lys Leu Phe Tyr Ala
370 375 380

Glu Tyr Leu Leu Leu Pro Phe Leu Lys
385 390

<210> 3 
<211> 1224 
<212> DNA 

<213> Arabidopsis sp 
<400> 3 

atggcgtttt ttgggctctc ccgtgtttca agacggttgt tgaaatcttc cgtctccgta
60

actccatctt cttcctctgc tcttttgcaa tcacaacata aatccttgtc caatcctgtg
120

actacccatt acacaaatcc tttcactaag tgttatcctt catggaatga taattaccaa
180

gtatggagta aaggaagaga attgcatcag gagaagtttt ttggtgttgg ttggaattac
240

agattaattt gtggaatgtc gtcgtcttct tcggttttgg agggaaagcc gaagaaagat
300

gataaggaga agagtgatgg tgttgttgtt aagaaagctt cttggataga tttgtattta
360

ccagaagaag ttagaggtta tgctaagctt gctcgattgg ataaacccat tggaacttgg
420

ttgcttgcgt ggccttgtat gtggtcgatt gcgttggctg ctgatcctgg aagccttcca
480

agttttaaat atatggcttt atttggttgc ggagcattac ttcttagagg tgctggttgt
540

actataaatg atctgcttga tcaggacata gatacaaagg ttgatcgtac aaaactaaga
600

cctatcgcca gtggtctttt gacaccattt caagggattg gatttctcgg gctgcagttg
660

cttttaggct tagggattct tctccaactt aacaattaca gccgtgtttt aggggcttca
720

tctttgttac ttgtcttttc ctacccactt atgaagaggt ttacattttg gcctcaagcc
780

tttttaggtt tgaccataaa ctggggagca ttgttaggat ggactgcagt taaaggaagc
840

atagcaccat ctattgtact ccctctctat ctctccggag tctgctggac ccttgtttat
900

gatactattt atgcacatca ggacaaagaa gatgatgtaa aagttggtgt taagtcaaca
960

gcccttagat tcggtgataa tacaaagctt tggttaactg gatttggcac agcatccata
1020

ggttttcttg cactttctgg attcagtgca gatctcgggt ggcaatatta cgcatcactg
1080

gccgctgcat caggacagtt aggatggcaa atagggacag ctgacttatc atctggtgct
1140

gactgcagta gaaaatttgt gtcgaacaag tggtttggtg ctattatatt tagtggagtt
1200

gtacttggaa gaagttttca ataa
1224

<210> 4
<211> 407
<212> PRT
<213> Arabidopsis sp
<400> 4

Met Ala Phe Phe Gly Leu Ser Arg Val Ser Arg Arg Leu Leu Lys Ser
1 5 10 15

Ser Val Ser Val Thr Pro Ser Ser Ser Ser Ala Leu Leu Gln Ser Gln
20 25 30

His Lys Ser Leu Ser Asn Pro Val Thr Thr His Tyr Thr Asn Pro Phe
35 40 45

Thr Lys Cys Tyr Pro Ser Trp Asn Asp Asn Tyr Gln Val Trp Ser Lys
50 55 60

Gly Arg Glu Leu His Gln Glu Lys Phe Phe Gly Val Gly Trp Asn Tyr
65 70 75 80

Arg Leu Ile Cys Gly Met Ser Ser Ser Ser Ser Val Leu Glu Gly Lys
85 90 95

Pro Lys Lys Asp Asp Lys Glu Lys Ser Asp Gly Val Val Val Lys Lys
100 105 110

Ala Ser Trp Ile Asp Leu Tyr Leu Pro Glu Glu Val Arg Gly Tyr Ala
115 120 125

Lys Leu Ala Arg Leu Asp Lys Pro Ile Gly Thr Trp Leu Leu Ala Trp
130 135 140

Pro Cys Met Trp Ser Ile Ala Leu Ala Ala Asp Pro Gly Ser Leu Pro
145 150 155 160

Ser Phe Lys Tyr Met Ala Leu Phe Gly Cys Gly Ala Leu Leu Leu Arg
165 170 175

Gly Ala Gly Cys Thr Ile Asn Asp Leu Leu Asp Gln Asp Ile Asp Thr
180 185 190

Lys Val Asp Arg Thr Lys Leu Arg Pro Ile Ala Ser Gly Leu Leu Thr
195 200 205

Pro Phe Gln Gly Ile Gly Phe Leu Gly Leu Gln Leu Leu Leu Gly Leu
210 215 220

Gly Ile Leu Leu Gln Leu Asn Asn Tyr Ser Arg Val Leu Gly Ala Ser
225 230 235 240

Ser Leu Leu Leu Val Phe Ser Tyr Pro Leu Met Lys Arg Phe Thr Phe
245 250 255

Trp Pro Gln Ala Phe Leu Gly Leu Thr Ile Asn Trp Gly Ala Leu Leu
260 265 270

Gly Trp Thr Ala Val Lys Gly Ser Ile Ala Pro Ser Ile Val Leu Pro
275 280 285

Leu Tyr Leu Ser Gly Val Cys Trp Thr Leu Val Tyr Asp Thr Ile Tyr
290 295 300

Ala His Gln Asp Lys Glu Asp Asp Val Lys Val Gly Val Lys Ser Thr
305 310 315 320

Ala Leu Arg Phe Gly Asp Asn Thr Lys Leu Trp Leu Thr Gly Phe Gly
325 330 335

Thr Ala Ser Ile Gly Phe Leu Ala Leu Ser Gly Phe Ser Ala Asp Leu
340 345 350

Gly Trp Gln Tyr Tyr Ala Ser Leu Ala Ala Ala Ser Gly Gln Leu Gly
355 360 365

Trp Gln Ile Gly Thr Ala Asp Leu Ser Ser Gly Ala Asp Cys Ser Arg
370 375 380

Lys Phe Val Ser Asn Lys Trp Phe Gly Ala Ile Ile Phe Ser Gly Val
385 390 395 400

Val Leu Gly Arg Ser Phe Gln
405

<210> 5
<211> 1296
<212> DNA

<213> Arabidopsis sp
<400> 5

atgtggcgaa gatctgttgt ttctcgttta tcttcaagaa tctctgtttc ttcttcgtta
60

ccaaacccta gactgattcc ttggtcccgc gaattatgtg ccgttaatag cttctcccag
120

cctccggtct cgacggaatc aactgctaag ttagggatca ctggtgttag atctgatgcc
180

aatcgagttt ttgccactgc tactgccgcc gctacagcta cagctaccac cggtgagatt
240

tcgtctagag ttgcggcttt ggctggatta gggcatcact acgctcgttg ttattgggag
300

ctttctaaag ctaaacttag tatgcttgtg gttgcaactt ctggaactgg gtatattctg
360

ggtacgggaa atgctgcaat tagcttcccg gggctttgtt acacatgtgc aggaaccatg
420

atgattgctg catctgctaa ttccttgaat cagatttttg agataagcaa tgattctaag
480

atgaaaagaa cgatgctaag gccattgcct tcaggacgta ttagtgttcc acacgctgtt
540

gcatgggcta ctattgctgg tgcttctggt gcttgtttgt tggccagcaa gactaatatg
600

ttggctgctg gacttgcatc tgccaatctt gtactttatg cgtttgttta tactccgttg
660

aagcaacttc accctatcaa tacatgggtt ggcgctgttg ttggtgctat cccacccttg
720

cttgggtggg cggcagcgtc tggtcagatt tcatacaatt cgatgattct tccagctgct
780

ctttactttt ggcagatacc tcattttatg gcccttgcac atctctgccg caatgattat
840

gcagctggag gttacaagat gttgtcactc tttgatccgt cagggaagag aatagcagca
900

gtggctctaa ggaactgctt ttacatgatc cctctcggtt tcatcgccta tgactggggg
960

ttaacctcaa gttggttttg cctcgaatca acacttctca cactagcaat cgctgcaaca
1020

gcattttcat tctaccgaga ccggaccatg cataaagcaa ggaaaatgtt ccatgccagt
1080

cttctcttcc ttcctgtttt catgtctggt cttcttctac accgtgtctc taatgataat
1140

cagcaacaac tcgtagaaga agccggatta acaaattctg tatctggtga agtcaaaact
1200

cagaggcgaa agaaacgtgt ggctcaacct ccggtggctt atgcctctgc tgcaccgttt
1260

cctttcctcc cagctccttc cttctactct ccatga
1296

<210> 6 
<211> 431 
<212> PRT 

<213> Arabidopsis sp 
<400> 6 

Met Trp Arg Arg Ser Val Val Tyr Arg Phe Ser Ser Arg Ile Ser Val
1 5 10 15

Ser Ser Ser Leu Pro Asn Pro Arg Leu Ile Pro Trp Ser Arg Glu Leu
20 25 30

Cys Ala Val Asn Ser Phe Ser Gln Pro Pro Val Ser Thr Glu Ser Thr
35 40 45

Ala Lys Leu Gly Ile Thr Gly Val Arg Ser Asp Ala Asn Arg Val Phe
50 55 60

Ala Thr Ala Thr Ala Ala Ala Thr Ala Thr Ala Thr Thr Gly Glu Ile
65 70 75 80

Ser Ser Arg Val Ala Ala Leu Ala Gly Leu Gly His His Tyr Ala Arg
85 90 95

Cys Tyr Trp Glu Leu Ser Lys Ala Lys Leu Ser Met Leu Val Val Ala
100 105 110

Thr Ser Gly Thr Gly Tyr Ile Leu Gly Thr Gly Asn Ala Ala Ile Ser
115 120 125

Phe Pro Gly Leu Cys Tyr Thr Cys Ala Gly Thr Met Met Ile Ala Ala
130 135 140

Ser Ala Asn Ser Leu Asn Gln Ile Phe Glu Ile Ser Asn Asp Ser Lys
145 150 155 160

Met Lys Arg Thr Met Leu Arg Pro Leu Pro Ser Gly Arg Ile Ser Val
165 170 175

Pro His Ala Val Ala Trp Ala Thr Ile Ala Gly Ala Ser Gly Ala Cys
180 185 190

Leu Leu Ala Ser Lys Thr Asn Met Leu Ala Ala Gly Leu Ala Ser Ala
195 200 205

Asn Leu Val Leu Tyr Ala Phe Val Tyr Thr Pro Leu Lys Gln Leu His
210 215 220

Pro Ile Asn Thr Trp Val Gly Ala Val Val Gly Ala Ile Pro Pro Leu
225 230 235 240

Leu Gly Trp Ala Ala Ala Ser Gly Gln Ile Ser Tyr Asn Ser Met Ile
245 250 255

Leu Pro Ala Ala Leu Tyr Phe Trp Gln Ile Pro His Phe Met Ala Leu
260 265 270

Ala His Leu Cys Arg Asn Asp Tyr Ala Ala Gly Gly Tyr Lys Met Leu
275 280 285

Ser Leu Phe Asp Pro Ser Gly Lys Arg Ile Ala Ala Val Ala Leu Arg
290 295 300

Asn Cys Phe Tyr Met Ile Pro Leu Gly Phe Ile Ala Tyr Asp Trp Gly
305 310 315 320

Leu Thr Ser Ser Trp Phe Cys Leu Glu Ser Thr Leu Leu Thr Leu Ala
325 330 335

Ile Ala Ala Thr Ala Phe Ser Phe Tyr Arg Asp Arg Thr Met His Lys
340 345 350

Ala Arg Lys Met Phe His Ala Ser Leu Leu Phe Leu Pro Val Phe Met
355 360 365

Ser Gly Leu Leu Leu His Arg Val Ser Asn Asp Asn Gln Gln Gln Leu
370 375 380

Val Glu Glu Ala Gly Leu Thr Asn Ser Val Ser Gly Glu Val Lys Thr
385 390 395 400

Gln Arg Arg Lys Lys Arg Val Ala Gln Pro Pro Val Ala Tyr Ala Ser
405 410 415

Ala Ala Pro Phe Pro Phe Leu Pro Ala Pro Ser Phe Tyr Ser Pro
420 425 430

<210> 7 
<211> 479 
<212> DNA 

<213> Arabidopsis sp 
<400> 7 

ggaaactccc ggagcacctg tttgcaggta ccgctaacct taatcgataa tttatttctc
60

ttgtcaggaa ttatgtaagt ctggtggaag gctcgcatac catttttgca ttgcctttcg
120

ctatgatcgg gtttactttg ggtgtgatga gaccaggcgt ggctttatgg tatggcgaaa
180

acccattttt atccaatgct gcattccctc ccgatgattc gttctttcat tcctatacag
240

gtatcatgct gataaaactg ttactggtac tggtttgtat ggtatcagca agaagcgcgg
300

cgatggcgtt taaccggtat ctcgacaggc attttgacgc gaagaacccg cgtactgcca
360

tccgtgaaat acctgcgggc gtcatatctg ccaacagtgc gctggtgttt acgataggct
420

gctgcgtggt attctgggtg gcctgttatt tcattaacac gatctgtttt tacctggcg
479

<210> 8
<211> 551
<212> DNA

<213> Arabidopsis sp
<220>

<221> misc_feature
<222> (1)...(551)
<223> n = A,T,C or G

<400> 8

ttgtggctta caccttaatg agcatacgcc agnccattac ggctcgttaa tcggcgccat
60

ngccggngct gntgcaccgg tagtgggcta ctgcgccgtg accaatcagc ttgatctagc
120

ggctcttatt ctgtttttaa ttttactgtt ctggcaaatg ccgcattttt acgcgatttc
180

cattttcagg ctaaaagact tttcagcggc ctgtattccg gtgctgccca tcattaaaga
240

cctgcgctat accaaaatca gcatgctggt ttacgtgggc ttatttacac tggctgctat
300

catgccggcc ctcttagggt atgccggttg gatttatggg atagcggcct taattttagg
360

cttgtattgg ctttatattg ccatacaagg attcaagacc gccgatgatc aaaaatggtc
420

tcgtaagatg tttggatctt cgattttaat cattaccctc ttgtcggtaa tgatgcttgt
480

ttaaacttac tgcctcctga agtttatata tcgataattt cagcttaagg aggcttagtg
540

gttaattcaa t
551

<210> 9 
<211> 297 
<212> PRT 

<213> Arabidopsis sp 
<400> 9 

Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1 5 10 15

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
20 25 30

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
35 40 45

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
50 55 60

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65 70 75 80

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
85 90 95

Val Val Met Gly Asn Lys Val Val Ala Leu Leu Ala Thr Ala Val Glu
100 105 110

His Leu Val Thr Gly Glu Thr Met Glu Ile Thr Ser Ser Thr Glu Gln
115 120 125

Arg Tyr Ser Met Asp Tyr Tyr Met Gln Lys Thr Tyr Tyr Lys Thr Ala
130 135 140

Ser Leu Ile Ser Asn Ser Cys Lys Ala Val Ala Val Leu Thr Gly Gln
145 150 155 160

Thr Ala Glu Val Ala Val Leu Ala Phe Glu Tyr Gly Arg Asn Leu Gly
165 170 175

Leu Ala Phe Gln Leu Ile Asp Asp Ile Leu Asp Phe Thr Gly Thr Ser
180 185 190

Ala Ser Leu Gly Lys Gly Ser Leu Ser Asp Ile Arg His Gly Val Ile
195 200 205

Thr Ala Pro Ile Leu Phe Ala Met Glu Glu Phe Pro Gln Leu Arg Glu
210 215 220

Val Val Asp Gln Val Glu Lys Asp Pro Arg Asn Val Asp Ile Ala Leu
225 230 235 240

Glu Tyr Leu Gly Lys Ser Lys Gly Ile Gln Arg Ala Arg Glu Leu Ala
245 250 255

Met Glu His Ala Asn Leu Ala Ala Ala Ala Ile Gly Ser Leu Pro Glu
260 265 270

Thr Asp Asn Glu Asp Val Lys Arg Ser Arg Arg Ala Leu Ile Asp Leu
275 280 285

Thr His Arg Val Ile Thr Arg Asn Lys
290 295

<210> 10
<211> 561
<212> DNA

<213> Arabidopsis sp
<400> 10

aagcgcatcc gtcctctt acgattgccg ccagccgcat gtatggctgc ataaccgacc
60

gcccctatcc gctcgcgc gcggtcgaat tcattcacac cgcgacgctg ctgcatgacg
120

acgtcgtcga tgaaagcc ttgcgccgcg gccgcgaaag cgcgcataag gttttcggca
180

atcaggcgag cgtgctcc ggcgatttcc ttttctcccg cgccttccag ctgatggtgg
240

aagacggctc gctcgacc ctgcgcattc tctcggatgc ctccgccgtg atcgcgcagg
300

gcgaagtgat gcagctcc accgcgcgca atcttgaaac caatatgagc cagtatctcg
360

atgtgatcag cgcgaagc gccgcgctct ttgccgccgc ctgcgaaatc ggcccggtga
420

tggcgaacgc gaaggcgc gatgctgccg cgatgtgcga atacggcatg aatctcggta
480

tcgccttcca gatcatcc gaccttctcg attacggcac cggcggccac gccgagcttg
540

gcaagaacac gggcgacgat t
561

<210> 11
<211> 966
<212> DNA

<213> Arabidopsis sp
<400> 11

atggtacttg ccgaggttcc aaagcttgcc tctgctgctg agtacttctt caaaaggggt
60

gtgcaaggaa aacagtttcg ttcaactatt ttgctgctga tggcgacagc tctgaatgta
120

cgcgttccag aagcattgat tggggaatca acagatatag tcacatcaga attacgcgta
180

aggcaacggg gtattgctga aatcactgaa atgatacacg tcgcaagtct actgcacgat
240

gatgtcttgg atgatgccga tacaaggcgt ggtgttggtt ccttaaatgt tgtaatgggt
300

aacaagatgt cggtattagc aggagacttc ttgctctccc gggcttgtgg ggctctcgct
360

gctttaaaga acacagaggt tgtagcatta cttgcaactg ctgtagaaca tcttgttacc
420

ggtgaaacca tggaaataac tagttcaacc gagcagcgtt atagtatgga ctactacatg
480

cagaagacat attataagac agcatcgcta atctctaaca gctgcaaagc tgttgccgtt
540

ctcactggac aaacagcaga agttgccgtg ttagcttttg agtatgggag gaatctgggt
600

ttagcattcc aattaataga cgacattctt gatttcacgg gcacatctgc ctctctcgga
660

aagggatcgt tgtcagatat tcgccatgga gtcataacag ccccaatcct ctttgccatg
720

gaagagtttc ctcaactacg cgaagttgtt gatcaagttg aaaaagatcc taggaatgtt
780

gacattgctt tagagtatct tgggaagagc aagggaatac agagggcaag agaattagcc
840

atggaacatg cgaatctagc agcagctgca atcgggtctc tacctgaaac agacaatgaa
900

gatgtcaaaa gatcgaggcg ggcacttatt gacttgaccc atagagtcat caccagaaac
960

aagtga
966

WO 02/33060 PCT/US01/42673

<210> 12
<211> 321
<212> PRT

<213> Arabidopsis sp

<400> 12

Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1 5 10 15

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
20 25 30

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
35 40 45

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
50 55 60

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65 70 75 80

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
85 90 95

Val Val Met Gly Asn Lys Met Ser Val Leu Ala Gly Asp Phe Leu Leu
100 105 110

Ser Arg Ala Cys Gly Ala Leu Ala Ala Leu Lys Asn Thr Glu Val Val
115 120 125

Ala Leu Leu Ala Thr Ala Val Glu His Leu Val Thr Gly Glu Thr Met
130 135 140

Glu Ile Thr Ser Ser Thr Glu Gln Arg Tyr Ser Met Asp Tyr Tyr Met
145 150 155 160

Gln Lys Thr Tyr Tyr Lys Thr Ala Ser Leu Ile Ser Asn Ser Cys Lys
165 170 175

Ala Val Ala Val Leu Thr Gly Gln Thr Ala Glu Val Ala Val Leu Ala
180 185 190

Phe Glu Tyr Gly Arg Asn Leu Gly Leu Ala Phe Gln Leu Ile Asp Asp
195 200 205

Ile Leu Asp Phe Thr Gly Thr Ser Ala Ser Leu Gly Lys Gly Ser Leu
210 215 220

Ser Asp Ile Arg His Gly Val Ile Thr Ala Pro Ile Leu Phe Ala Met
225 230 235 240

Glu Glu Phe Pro Gln Leu Arg Glu Val Val Asp Gln Val Glu Lys Asp
245 250 255

Pro Arg Asn Val Asp Ile Ala Leu Glu Tyr Leu Gly Lys Ser Lys Gly
260 265 270

Ile Gln Arg Ala Arg Glu Leu Ala Met Glu His Ala Asn Leu Ala Ala
275 280 285

Ala Ala Ile Gly Ser Leu Pro Glu Thr Asp Asn Glu Asp Val Lys Arg
290 295 300

Ser Arg Arg Ala Leu Ile Asp Leu Thr His Arg Val Ile Thr Arg Asn
305 310 315 320

Lys

<210> 13
<211> 621
<212> DNA

<213> Arabidopsis sp 
<400> 13 

gctttctcct ttgctaattc ttgagctttc ttgatcccac cgcgatttct aactatttca 
60 

atcgcttctt caagcgatcc aggctcacaa aactcagact caatgatctc tcttagcctt
120

ggctcattct ctagcgcgaa gatcactggc gccgttatgt tacctttggc taagtcatta
180

gctgcaggct tacctaactg ctctgtggac tgagtgaagt ccagaatgtc atcaactact
240

tgaaaagata aaccgagatt cttcccgaac tgatacattt gctctgcgac cttgctttcg
300

actttactga aaattgctgc tcctttggtg cttgcagcta ctaatgaagc tgtcttgtag
360 

taactcttta gcatgtagtc atcaagcttg acatcacaat cgaataaact cgatgcttgc 
420 

tttatctcac cgcttgcaaa atctttgatc acctgcaaaa agataaatca agattcagac 
480 

caaatgttct ttgtattgag tagcttcatc taatctcaga aaggaatatt acctgactta 
540 

tgagcttaat gacttcaagg ttttcgagat ttgtaagtac catgatgctt gagcaacatg
600 

aaatccccag ctaatacagc t 
621 

<210> 14 
<211> 741 
<212> DNA 

<213> Arabidopsis sp
<400> 14 

ggtgagtttt gttaatagtt atgagattca tctatttttg tcataaaatt gtttggtttg 
60 

gtttaaactc tgtgtataat tgcaggaaag gaaacagttc atgagctttt cggcacaaga 
120 

gtagcggtgc tagctggaga tttcatgttt gctcaagcgt catggtactt agcaaatctc
180 

gagaatcttg aagttattaa gctcatcagt caggtactta gttactctta cattgttttt 
240 
ctatgaggtt gagctatgaa tctcatttcg ttgaataatg ctgtgcctca aacttttttt
300

catgttttca ggtgatcaaa gactttgcaa gcggagagat aaagcaggcg tccagcttat
360

ttgactgcga caccaagctc gacgagtact tactcaaaag tttctacaag acagcctctt
420

tagtggctgc gagcaccaaa ggagctgcca ttttcagcag agttgagcct gatgtgacag
480

aacaaatgta cgagtttggg aagaatctcg gtctctcttt ccagatagtt gatgatattt
540

tggatttcac tcagtcgaca gagcagctcg ggaagccagc agggagtgat ttggctaaag
600

gtaacttaac agcacctgtg attttcgctc tggagaggga gccaaggcta agagagatca
660

ttgagtcaaa gttctgtgag gcgggttctc tggaagaagc gattgaagcg gtgacaaaag
720 

gtggggggat taagagagca c 
741 

<210> 15 
<211> 1087 
<212> DNA 

<213> Arabidopsis sp 
<400> 15 

cctcttcagc caatccagag gaagaagaga caacttttta tctttcgtca agagtctccg
60

aaaacgcacg gttttatgct ctctcttctg ccctcacctc acaagacgca gggcacatga
120

ttcaaccaga gggaaaaagc aacgataaca actctgcttt tgatttcaag ctgtatatga
180

tccgcaaagc cgagtctgta aatgcggctc tcgacgtttc cgtaccgctt ctgaaacccc
240

ttacgatcca agaagcggtc aggtactctt tgctagccgg cggaaaacgt gtgaggcctc
300

tgctctgcat tgccgcttgt gagcttgtgg ggggcgacga ggctactgcc atgtcagccg
360

cttgcgcggt cgagatgatc cacacaagct ctctcattca tgacgatctt ccgtgcatgg
420

acaatgccga cctccgtaga ggcaagccca ccaatcacaa ggtatgttgt ttaattatat
480

gaaggctcag agataatgct gaactagtgt tgaaccaatt tttgctcaaa caaggtatat
540

ggagaagaca tggcggtttt ggcaggtgat gcactccttg cattggcgtt tgagcacatg
600

acggttgtgt cgagtgggtt ggtcgctccc gagaagatga ttcgcgccgt ggttgagctg
660

gccagggcca tagggactac agggctagtt gctggacaaa tgatagacct agccagcgaa
720

agactgaatc cagacaaggt tggattggag catctagagt tcatccatct ccacaaaacg
780

gcggcattgt tggaggcagc ggcagtttta ggggttataa tgggaggtgg aacagaggaa
840

gaaatcgaaa agcttagaaa gtatgctagg tgtattggac tactgtttca ggttgttgat
900

gacattctcg acgtaacaaa atctactgag gaattgggta agacagccgg aaaagacgta
960

atggccggaa agctgacgta tccaaggctg ataggtttgg agggatccag ggaagttgca
1020

gagcacctga ggagagaagc agaggaaaag cttaaagggt ttgatccaag tcaggcggcg
1080

cctctgg
1087 

<210> 16 
<211> 1164 
<212> DNA 

<213> Arabidopsis sp 
<400> 16 

atgacttcga ttctcaacac tgtctccacc atccactctt ccagagttac ctccgtcgat 
60 

cgagtcggag tcctctctct tcggaattcg gattccgttg agttcactcg ccggcgttct 
120 

ggtttctcga cgttgatcta cgaatcaccc gggcggagat ttgttgtgcg tgcggcggag
180

actgatactg ataaagttaa atctcagaca cctgacaagg caccagccgg tggttcaagc
240

attaaccagc ttctcggtat caaaggagca tctcaagaaa ctaataaatg gaagattcgt
300 

cttcagctta caaaaccagt cacttggcct ccactggttt ggggagtcgt ctgtggtgct 
360 

gctgcttcag ggaactttca ttggacccca gaggatgttg ctaagtcgat tctttgcatg 
420 

atgatgtctg gtccttgtct tactggctat acacagacaa tcaacgactg gtatgataga 
480 

gatatcgacg caattaatga gccatatcgt ccaattccat ctggagcaat atcagagcca
540

gaggttatta cacaagtctg ggtgctatta ttgggaggtc ttggtattgc tggaatatta
600 

gatgtgtggg cagggcatac cactcccact gtcttctatc ttgctttggg aggatcattg 
660 

ctatcttata tatactctgc tccacctctt aagctaaaac aaaatggatg ggttggaaat 
720 

tttgcacttg gagcaagcta tattagtttg ccatggtggg ctggccaagc attgtttggc
780

actcttacgc cagatgttgt tgttctaaca ctcttgtaca gcatagctgg gttaggaata
840

gccattgtta acgacttcaa aagtgttgaa ggagatagag cattaggact tcagtctctc
900

ccagtagctt ttggcaccga aactgcaaaa tggatatgcg ttggtgctat agacattact
960

cagctttctg ttgccggata tctattagca tctgggaaac cttattatgc gttggcgttg
1020

gttgctttga tcattcctca gattgtgttc cagtttaaat actttctcaa ggaccctgtc
1080

aaatacgacg tcaagtacca ggcaagcgcg cagccattct tggtgctcgg aatatttgta
1140

acggcattag catcgcaaca ctga
1164 

<210> 17 
<211> 387 
<212> PRT 

<213> Arabidopsis sp 
<400> 17 

Met Thr Ser Ile Leu Asn Thr Val Ser Thr Ile His Ser Ser Arg Val
1 5 10 15

Thr Ser Val Asp Arg Val Gly Val Leu Ser Leu Arg Asn Ser Asp Ser
20 25 30

Val Glu Phe Thr Arg Arg Arg Ser Gly Phe Ser Thr Leu Ile Tyr Glu
35 40 45

Ser Pro Gly Arg Arg Phe Val Val Arg Ala Ala Glu Thr Asp Thr Asp
50 55 60

Lys Val Lys Ser Gln Thr Pro Asp Lys Ala Pro Ala Gly Gly Ser Ser
65 70 75 80

Ile Asn Gln Leu Leu Gly Ile Lys Gly Ala Ser Gln Glu Thr Asn Lys
85 90 95

Trp Lys Ile Arg Leu Gln Leu Thr Lys Pro Val Thr Trp Pro Pro Leu
100 105 110

Val Trp Gly Val Val Cys Gly Ala Ala Ala Ser Gly Asn Phe His Trp
115 120 125

Thr Pro Glu Asp Val Ala Lys Ser Ile Leu Cys Met Met Met Ser Gly
130 135 140

Pro Cys Leu Thr Gly Tyr Thr Gln Thr Ile Asn Asp Trp Tyr Asp Arg
145 150 155 160

Asp Ile Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala
165 170 175

Ile Ser Glu Pro Glu Val Ile Thr Gln Val Trp Val Leu Leu Leu Gly
180 185 190

Gly Leu Gly Ile Ala Gly Ile Leu Asp Val Trp Ala Gly His Thr Thr
195 200 205

Pro Thr Val Phe Tyr Leu Ala Leu Gly Gly Ser Leu Leu Ser Tyr Ile
210 215 220

Tyr Ser Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Val Gly Asn
225 230 235 240

Phe Ala Leu Gly Ala Ser Tyr Ile Ser Leu Pro Trp Trp Ala Gly Gln
245 250 255

Ala Leu Phe Gly Thr Leu Thr Pro Asp Val Val Val Leu Thr Leu Leu
260 265 270

Tyr Ser Ile Ala Gly Leu Gly Ile Ala Ile Val Asn Asp Phe Lys Ser
275 280 285

Val Glu Gly Asp Arg Ala Leu Gly Leu Gln Ser Leu Pro Val Ala Phe
290 295 300

Gly Thr Glu Thr Ala Lys Trp Ile Cys Val Gly Ala Ile Asp Ile Thr
305 310 315 320

Gln Leu Ser Val Ala Gly Tyr Leu Leu Ala Ser Gly Lys Pro Tyr Tyr
325 330 335

Ala Leu Ala Leu Val Ala Leu Ile Ile Pro Gln Ile Val Phe Gln Phe
340 345 350

Lys Tyr Phe Leu Lys Asp Pro Val Lys Tyr Asp Val Lys Tyr Gln Ala
355 360 365

Ser Ala Gln Pro Phe Leu Val Leu Gly Ile Phe Val Thr Ala Leu Ala
370 375 380

Ser Gln His
385

<210> 18 
<211> 981 
<212> DNA 

<213> Arabidopsis sp 
<400> 18 

atgttgttta gtggttcagc gatcccatta agcagcttct gctctcttcc ggagaaaccc
60

cacactcttc ctatgaaact ctctcccgct gcaatccgat cttcatcctc atctgccccg
120

gggtcgttga acttcgatct gaggacgtat tggacgactc tgatcaccga gatcaaccag
180

aagctggatg aggccatacc ggtcaagcac cctgcgggga tctacgaggc tatgagatac
240

tctgtactcg cacaaggcgc caagcgtgcc cctcctgtga tgtgtgtggc ggcctgcgag
300

ctcttcggtg gcgatcgcct cgccgctttc cccaccgcct gtgccctaga aatggtgcac
360

gcggcttcgt tgatacacga cgacctcccc tgtatggacg acgatcctgt gcgcagagga
420

aagccatcta accacactgt ctacggctct ggcatggcca ttctcgccgg tgacgccctc
480

ttcccactcg ccttccagca cattgtctcc cacacgcctc ctgaccttgt tccccgagcc
540

accatcctca gactcatcac tgagattgcc cgcactgtcg gctccactgg tatggctgca
600

ggccagtacg tcgaccttga aggaggtccc tttcctcttt cctttgttca ggagaagaaa
660

ttcggagcca tgggtgaatg ctctgccgtg tgcggtggcc tattgggcgg tgccactgag
720

gatgagctcc agagtctccg aaggtacggg agagccgtcg ggatgctgta tcaggtggtc
780

gatgacatca ccgaggacaa gaagaagagc tatgatggtg gagcagagaa gggaatgatg
840

gaaatggcgg aagagctcaa ggagaaggcg aagaaggagc ttcaagtgtt tgacaacaag
900

tatggaggag gagacacact tgttcctctc tacaccttcg ttgactacgc tgctcatcga
960

cattttcttc ttcccctctg a
981 

<210> 19 
<211> 245 
<212> DNA 
<213> Glycine sp

<400> 19 

gcaacatctg ggactgggtt tgtcttgggg agtggtagtg ctgttgatct ttcggcactt
60

tcttgcactt gcttgggtac catgatggtt gctgcatctg ctaactcttt gaatcaggtg
120

tttgagatca ataatgatgc taaaatgaag agaacaagtc gcaggccact accctcagga
180

cgcatcacaa tacctcatgc agttggctgg gcatcctctg ttggattagc tggtacggct
240

ctact

245 

<210> 20 
<211> 253
<212> DNA 
<213> Glycine sp 

<400> 20 

attggctttc caagatcatt gggttttctt gttgcattca tgaccttcta ctccttgggt
60

ttggcattgt ccaaggatat acctgacgtt gaaggagata aagagcacgg cattgattct
120

tttgcagtac gtctaggtca gaaacgggca ttttggattt gcgtttcctt ttttgaaatg
180

gctttcggag ttggtatcct ggccggagca tcatgctcac acttttggac taaaattttc
240

acgggtatgg gaa
253 

<210> 21 
<211> 275
<212> DNA
<213> Glycine sp

<400> 21 

tgatcttcta ctctctgggt atggcattgt ccaaggatat atctgacgtt aaaggagata
60

aagcdtacgg catcgatact ttagcgakac gtttgggtca aaaatgggta ttttggattt
120

gcattatcct ttttgaaatg gcttttggag ttgccctctt ggcaggagca acatcttctt
180

acctttggat taaaattgtc acgggtctgg gacatgctat tcttgcttca attctcttgt
240

accaagccaa atctatatac ttgagcaaca aagtt
275 

<210> 22 
<211> 299 
<212> DNA 
<213> Glycine sp 

<220> 

<221> misc_feature
<222> (1) . . . (299)
<223> n = A,T,C or G 

<400> 22 

ccanaatang tncatcttng aaagacaatt ggcctcttca acacacaagt ctgcatgtga
60

agaagaggcc aattgtcttt ccaagatcac ttatngtggc tattgtaatc atgaacttct
120

tctttgtggg tatggcattg gcaaaggata tacctanctg ttgaaggaga taaaatatat
180

ggcattgata cttttgcaat acgtataggt caaaaacaag tattttggat ttgtattttc
240

ctttttgaaa ggctttcgga gtttccctag tggcaggagc aacatcttct agccttggt
299 

<210> 23 
<211> 767 
<212> DNA 
<213> Glycine sp 

<400> 23 

gtggaggctg tggttgctgc cctgtttatg aatatttata ttgttggttt gaatcaattg
60

tctgatgttg aaatagacaa gataaacaag ccgtatcttc cattagcatc tggggaatat
120

tcctttgaaa ctggtgtcac tattgttgca tctttttcaa ttctgagttt ttggcttggc
180

tgggttgtag gttcatggcc attattttgg gccctttttg taagctttgt gctaggaact
240

gcttattcaa tcaatgtgcc tctgttgaga tggaagaggt ttgcagtgct tgcagcgatg
300

tgcattctag ctgttcgggc agtaatagtt caacttgcat ttttccttca catgcagact
360

catgtgtaca agaggccacc tgtcttttca agaccattga tttttgctac tgcattcatg
420

agcttcttct ctgtagttat agcactgttt aaggatatac ctgacattga aggagataaa
480

gtatttggca tccaatcttt ttcagtgtgt ttaggtcaga agccggtgtt ctggacttgt
540

gttacccttc ttgaaatagc ttatggagtc gccctcctgg tgggagctgc atctccttgt
600

ctttggagca aaattttcac gggtctggga cacgctgtgc tggcttcaat tctctggttt
660

catgccaaat ctgtagattt gaaaagcaaa gcttcgataa catccttcta tatgtttatt
720

tggaagctat tttatgcaga atacttactc attccttttg ttagatg
767 

<210> 24 
<211> 255
<212> PRT 
<213> Glycine sp 

<400> 24 

Val Glu Ala Val Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly
1 5 10 15

Leu Asn Gln Leu Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr
20 25 30

Leu Pro Leu Ala Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile
35 40 45

Val Ala Ser Phe Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly
50 55 60

Ser Trp Pro Leu Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr
65 70 75 80

Ala Tyr Ser Ile Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val
85 90 95

Leu Ala Ala Met Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu
100 105 110

Ala Phe Phe Leu His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val
115 120 125

Phe Ser Arg Pro Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser
130 135 140

Val Val Ile Ala Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys
145 150 155 160

Val Phe Gly Ile Gln Ser Phe Ser Val Cys Leu Gly Gln Lys Pro Val
165 170 175

Phe Trp Thr Cys Val Thr Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu
180 185 190

Leu Val Gly Ala Ala Ser Pro Cys Leu Trp Ser Lys Ile Phe Thr Gly
195 200 205

Leu Gly His Ala Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser
210 215 220

Val Asp Leu Lys Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile
225 230 235 240

Trp Lys Leu Phe Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
245 250 255

<210> 25 
<211> 360 
<212> DNA 
<213> Zea sp 

<220> 

<221> misc_feature 
<222> (1) . . . (360) 
<223> n = A,T,C or G

<400> 25 

ggcgtcttca cttgttctgg tcttctcgta tcccctgatg aagaggttca cattttggcc 
60

tcaggcttat cttggcctga cattcaactg gggagcttta ctagggtggg ctgctattaa
120

ggaaagcata gaccctgcaa atcatccttc cattgtatac agctggtatt tgttggacgc
180

tggtgtatga tactatatat gcgcatcagg tgtttcgcta tccctacttt catattaatc
240

cttgatgaag tggccatttc atgttgtcgc ggtggtctta tacttgcata tctccatgca
300

tctcaggaca aagangatga cctgaaagta ggagtccaag tccacagctt aagatttggg
360 

<210> 26 
<211> 299 
<212> DNA
<213> Zea sp 

<220> 

<221> misc_feature 
<222> (1) . . . (299) 
<223> n = A,T,C or G

<400> 26 

gatggttgca gcatctgcaa ataccctcaa ccaggtgttt gngataaaaa atgatgctaa 
60 

aatgaaaagg acaatgcgtg ccccctgcca tctggtcgca ttagtcctgc acatgctgcg 
120 

atgtgggcta caagtgttgg agttgcagga acagctttgt tggcctggaa ggctaatggc 
180 

ttggcagctg ggcttgcagc ttctaatctt gttctgtatg catttgtgta tacgccgttg
240

aagcaaatac accctgttaa tacatgggtt ggggcagtcg ttggtgccat cccaccact
299 

<210> 27 
<211> 255 
<212> DNA 
<213> Zea sp 

<220> 

<221> misc_feature 
<222> (1) . . . (255) 
<223> n = A,T,C or G

<400> 27 

anacttgcat atctccatgc ntctcaggac aaagangatg acctgaaagt aggtgtcaag
60

tccacagcat taagatttgg agatttgacc nnatactgna tcagtggctt tggcgcggca
120

tgcttcggca gcttagcact cagtggttac aatgctgacc ttggttggtg tttagtgtga
180

tgcttgagcg aagaatggta tngtttttac ttgatattga ctccagacct gaaatcatgt
240

tggacagggt ggccc
255 

<210> 28 
<211> 257 
<212> DNA 
<213> Zea sp 
<400> 28

attgaagggg ataggactct ggggcttcag tcacttcctg ttgcttttgg gatggaaact
60 

gcaaaatgga tttgtgttgg agcaattgat atcactcaat tatctgttgc aggttaccta 
120 

ttgagcaccg gtaagctgta ttatgccctg gtgttgcttg ggctaacaat tcctcaggtg
180

ttctttcagt tccagtactt cctgaaggac cctgtgaagt atgatgtcaa atatcaggca
240 

agcgcacaac cattctt 
257 

<210> 29 
<211> 368 
<212> DNA 
<213> Zea sp 

<400> 29 

atccagttgc aaataataat ggcgttcttc tctgttgtaa tagcactatt caaggatata 
60 

cctgacatcg aaggggaccg catattcggg atccgatcct tcagcgtccg gttagggcaa
120

aagaaggtct tttggatctg cgttggcttg cttgagatgg cctacagcgt tgcgatactg
180

atgggagcta cctcttcctg tttgtggagc aaaacagcaa ccatcgctgg ccattccata
240

cttgccgcga tcctatggag ctgcgcgcga tcggtggact tgacgagcaa agccgcaata
300

acgtccttct acatgttcat ctggaagctg ttctacgcgg agtacctgct catccctctg
360 

gtgcggtg 
368 

<210> 30
<211> 122 
<212> PRT 
<213> Zea sp 

<400> 30 

Ile Gln Leu Gln Ile Ile Met Ala Phe Phe Ser Val Val Ile Ala Leu
1 5 10 15

Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Arg Ile Phe Gly Ile Arg
20 25 30

Ser Phe Ser Val Arg Leu Gly Gln Lys Lys Val Phe Trp Ile Cys Val
35 40 45

Gly Leu Leu Glu Met Ala Tyr Ser Val Ala Ile Leu Met Gly Ala Thr
50 55 60

Ser Ser Cys Leu Trp Ser Lys Thr Ala Thr Ile Ala Gly His Ser Ile
65 70 75 80

Leu Ala Ala Ile Leu Trp Ser Cys Ala Arg Ser Val Asp Leu Thr Ser
85 90 95

Lys Ala Ala Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe Tyr
100 105 110

Ala Glu Tyr Leu Leu Ile Pro Leu Val Arg
115 120



<210> 31 
<211> 278 
<212> DNA 
<213> Zea sp 
<400> 31

tattcagcac cacctctcaa gctcaagcag aatggatgga ttgggaactt cgctctgggt
60

gcgagttaca tcagcttgcc ctggtgggct ggccaggcgt tatttggaac tcttacacca
120

gatatcattg tcttgactac tttgtacagc atagctgggc tagggattgc tattgtaaat
180

gatttcaaga gtattgaagg ggataggact ctggggcttc agtcacttcc tgttgctttt
240

gggatggaaa ctgcaaaatg gatttgtgtt ggagcaat
278 

<210> 32
<211> 292 
<212> PRT 

<213> Synechocystis sp



<400> 32 





Vox HJ.a uXn 


Thy* PT"rt 


OCX OCX XrXw 


Pro 


X)cU Xxp 


Leu 


Thv- 
xnx 


Tl <a 
XXc 


XXC 






c 




1 n 

Xv 








1 ^ 
XO 




Tyr 


Leu Leu Arg 


xrp nxs 


xiyo XTXU rVXa 


uxy 


g Xieu 


Tie 
XXC 


Leu 


nc u 


Tl o 
XXc 
















^n 








Ala Leu Trp 




jucu f\j^a 


rvxo 


Gin Glu 


xtcu 


Prn 


Prn 
cxu 


XlcU 








H U 












Prrk 


jjcu jjeu uxy 




Ala TiOii ri1\7 
r\xci XicU oxy 


Thr 

X IIX 


Leu Ala 


xnx 


^Ar 

OCX 


vxy 


- 

xieu 




DU 








DU 












uys VaX VelX 


A e n Hen 


UCU r\9^ 


Arg 


nop XX«S 


Asp 


crx V 


Gl n 


V ax 






70 






75 










Glu Arg Thr Lys Gln Arg Pro Leu Ala Ala Arg Ala Leu Ser Val Gln
85 90 95

Val Gly Ile Gly Val Ala Leu Val Ala Leu Leu Cys Ala Ala Gly Leu
100 105 110

Ala Phe Tyr Leu Thr Pro Leu Ser Phe Trp Leu Cys Val Ala Ala Val
115 120 125

Pro Val Ile Val Ala Tyr Pro Gly Ala Lys Arg Val Phe Pro Val Pro
130 135 140

Gln Leu Val Leu Ser Ile Ala Trp Gly Phe Ala Val Leu Ile Ser Trp
145 150 155 160

Ser Ala Val Thr Gly Asp Leu Thr Asp Ala Thr Trp Val Leu Trp Gly
165 170 175

Ala Thr Val Phe Trp Thr Leu Gly Phe Asp Thr Val Tyr Ala Met Ala
180 185 190

Asp Arg Glu Asp Asp Arg Arg Ile Gly Val Asn Ser Ser Ala Leu Phe
195 200 205

Phe Gly Gln Tyr Val Gly Glu Ala Val Gly Ile Phe Phe Ala Leu Thr
210 215 220

Ile Gly Cys Leu Phe Tyr Leu Gly Met Ile Leu Met Leu Asn Pro Leu
225 230 235 240

Tyr Trp Leu Ser Leu Ala Ile Ala Ile Val Gly Trp Val Ile Gln Tyr
245 250 255

Ile Gln Leu Ser Ala Pro Thr Pro Glu Pro Lys Leu Tyr Gly Gln Ile
260 265 270

Phe Gly Gln Asn Val Ile Ile Gly Phe Val Leu Leu Ala Gly Met Leu
275 280 285

Leu Gly Trp Leu
290

<210> 33 
<211> 316 
<212> PRT 

<213> Synechocystis sp
<400> 33

Met Val Thr Ser Thr Lys Ile His Arg Gln His Asp Ser Met Gly Ala
1 5 10 15

Val Cys Lys Ser Tyr Tyr Gln Leu Thr Lys Pro Arg Ile Ile Pro Leu
20 25 30

Leu Leu Ile Thr Thr Ala Ala Ser Met Trp Ile Ala Ser Glu Gly Arg
35 40 45

Val Asp Leu Pro Lys Leu Leu Ile Thr Leu Leu Gly Gly Thr Leu Ala
50 55 60

Ala Ala Ser Ala Gln Thr Leu Asn Cys Ile Tyr Asp Gln Asp Ile Asp
65 70 75 80

Tyr Glu Met Leu Arg Thr Arg Ala Arg Pro Ile Pro Ala Gly Lys Val
85 90 95

Gln Pro Arg His Ala Leu Ile Phe Ala Leu Ala Leu Gly Val Leu Ser
100 105 110

Phe Ala Leu Leu Ala Thr Phe Val Asn Val Leu Ser Gly Cys Leu Ala
115 120 125

Leu Ser Gly Ile Val Phe Tyr Met Leu Val Tyr Thr His Trp Leu Lys
130 135 140

Arg His Thr Ala Gln Asn Ile Val Ile Gly Gly Ala Ala Gly Ser Ile
145 150 155 160

Pro Pro Leu Val Gly Trp Ala Ala Val Thr Gly Asp Leu Ser Trp Thr
165 170 175

Pro Trp Val Leu Phe Ala Leu Ile Phe Leu Trp Thr Pro Pro His Phe
180 185 190

Trp Ala Leu Ala Leu Met Ile Lys Asp Asp Tyr Ala Gln Val Asn Val
195 200 205

Pro Met Leu Pro Val Ile Ala Gly Glu Glu Lys Thr Val Ser Gln Ile
210 215 220

Trp Tyr Tyr Ser Leu Leu Val Val Pro Phe Ser Leu Leu Leu Val Tyr
225 230 235 240

Pro Leu His Gln Leu Gly Ile Leu Tyr Leu Ala Ile Ala Ile Ile Leu
245 250 255

Gly Gly Gln Phe Leu Val Lys Ala Trp Gln Leu Lys Gln Ala Pro Gly
260 265 270

Asp Arg Asp Leu Ala Arg Gly Leu Phe Lys Phe Ser Ile Phe Tyr Leu
275 280 285

Met Leu Leu Cys Leu Ala Met Val Ile Asp Ser Leu Pro Val Thr His
290 295 300

Gln Leu Val Ala Gln Met Gly Thr Leu Leu Leu Gly
305 310 315

<210> 34 
<211> 324 
<212> PRT 

<213> Synechocystis sp 
<400> 34 

Met Ser Asp Thr Gln Asn Thr Gly Gln Asn Gln Ala Lys Ala Arg Gln
1 5 10 15

Leu Leu Gly Met Lys Gly Ala Ala Pro Gly Glu Ser Ser Ile Trp Lys
20 25 30

Ile Arg Leu Gln Leu Met Lys Pro Ile Thr Trp Ile Pro Leu Ile Trp
35 40 45

Gly Val Val Cys Gly Ala Ala Ser Ser Gly Gly Tyr Ile Trp Ser Val
50 55 60

Glu Asp Phe Leu Lys Ala Leu Thr Cys Met Leu Leu Ser Gly Pro Leu
65 70 75 80

Met Thr Gly Tyr Thr Gln Thr Leu Asn Asp Phe Tyr Asp Arg Asp Ile
85 90 95

Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala Ile Ser
100 105 110

Val Pro Gln Val Val Thr Gln Ile Leu Ile Leu Leu Val Ala Gly Ile
115 120 125

Gly Val Ala Tyr Gly Leu Asp Val Trp Ala Gln His Asp Phe Pro Ile
130 135 140

Met Met Val Leu Thr Leu Gly Gly Ala Phe Val Ala Tyr Ile Tyr Ser
145 150 155 160

Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Leu Gly Asn Tyr Ala
165 170 175

Leu Gly Ala Ser Tyr Ile Ala Leu Pro Trp Trp Ala Gly His Ala Leu
180 185 190

Phe Gly Thr Leu Asn Pro Thr Ile Met Val Leu Thr Leu Ile Tyr Ser
195 200 205

Leu Ala Gly Leu Gly Ile Ala Val Val Asn Asp Phe Lys Ser Val Glu
210 215 220

Gly Asp Arg Gln Leu Gly Leu Lys Ser Leu Pro Val Met Phe Gly Ile
225 230 235 240

Gly Thr Ala Ala Trp Ile Cys Val Ile Met Ile Asp Val Phe Gln Ala
245 250 255

Gly Ile Ala Gly Tyr Leu Ile Tyr Val His Gln Gln Leu Tyr Ala Thr
260 265 270

Ile Val Leu Leu Leu Leu Ile Pro Gln Ile Thr Phe Gln Asp Met Tyr
275 280 285

Phe Leu Arg Asn Pro Leu Glu Asn Asp Val Lys Tyr Gln Ala Ser Ala
290 295 300

Gln Pro Phe Leu Val Phe Gly Met Leu Ala Thr Gly Leu Ala Leu Gly
305 310 315 320

His Ala Gly Ile



<210> 35 
<211> 307 
<212> PRT 

<213> Synechocystis sp 
<400> 35 



Met Thr Glu Ser Ser Pro Leu Ala Pro Ser Thr Ala Pro Ala Thr Arg
1 5 10 15

Lys Leu Trp Leu Ala Ala Ile Lys Pro Pro Met Tyr Thr Val Ala Val
20 25 30

Val Pro Ile Thr Val Gly Ser Ala Val Ala Tyr Gly Leu Thr Gly Gln
35 40 45

Trp His Gly Asp Val Phe Thr Ile Phe Leu Leu Ser Ala Ile Ala Ile
50 55 60

Ile Ala Trp Ile Asn Leu Ser Asn Asp Val Phe Asp Ser Asp Thr Gly
65 70 75 80

Ile Asp Val Arg Lys Ala His Ser Val Val Asn Leu Thr Gly Asn Arg
85 90 95

Asn Leu Val Phe Leu Ile Ser Asn Phe Phe Leu Leu Ala Gly Val Leu
100 105 110

Gly Leu Met Ser Met Ser Trp Arg Ala Gln Asp Trp Thr Val Leu Glu
115 120 125

Leu Ile Gly Val Ala Ile Phe Leu Gly Tyr Thr Tyr Gln Gly Pro Pro
130 135 140

Phe Arg Leu Gly Tyr Leu Gly Leu Gly Glu Leu Ile Cys Leu Ile Thr
145 150 155 160

Phe Gly Pro Leu Ala Ile Ala Ala Ala Tyr Tyr Ser Gln Ser Gln Ser
165 170 175

Phe Ser Trp Asn Leu Leu Thr Pro Ser Val Phe Val Gly Ile Ser Thr
180 185 190

Ala Ile Ile Leu Phe Cys Ser His Phe His Gln Val Glu Asp Asp Leu
195 200 205

Ala Ala Gly Lys Lys Ser Pro Ile Val Arg Leu Gly Thr Lys Leu Gly
210 215 220

Ser Gln Val Leu Thr Leu Ser Val Val Ser Leu Tyr Leu Ile Thr Ala
225 230 235 240

Ile Gly Val Leu Cys His Gln Ala Pro Trp Gln Thr Leu Leu Ile Ile
245 250 255

Ala Ser Leu Pro Trp Ala Val Gln Leu Ile Arg His Val Gly Gln Tyr
260 265 270

His Asp Gln Pro Glu Gln Val Ser Asn Cys Lys Phe Ile Ala Val Asn
275 280 285

Leu His Phe Phe Ser Gly Met Leu Met Ala Ala Gly Tyr Gly Trp Ala
290 295 300

Gly Leu Gly
305

<210> 36 
<211> 927 
<212> DNA 

<213> Synechocystis sp 
<400> 36 

atggcaacta tccaagcttt ttggcgcttc tcccgccccc ataccatcat tggtacaact 
60 

ctgagcgtct gggctgtgta tctgttaact attctcgggg atggaaactc agttaactcc
120 

cctgcttccc tggatttagt gttcggcgct tggctggcct gcctgttggg taatgtgtac 
180 

attgtcggcc tcaaccaatt gtgggatgtg gacattgacc gcatcaataa gccgaatttg
240 

cccctagcta acggagattt ttctatcgcc cagggccgtt ggattgtggg actttgtggc 
300 

gttgcttcct tggcgatcgc ctggggatta gggctatggc tggggctaac ggtgggcatt
360 

agtttgatta ttggcacggc ctattcggtg ccgccagtga ggttaaagcg cttttccctg 
420 

ctggcggccc tgtgtattct gacggtgcgg ggaattgtgg ttaacttggg cttattttta 
480 

ttttttagaa ttggtttagg ttatcccccc actttaataa cccccatctg ggttttgact 
540 

ttatttatct tagttttcac cgtggcgatc gccattttta aagatgtgcc agatatggaa 
600 

ggcgatcggc aatttaagat tcaaacttta actttgcaaa tcggcaaaca aaacgttttt 
660 

cggggaacct taattttact cactggttgt tatttagcca tggcaatctg gggcttatgg 
720 

gcggctatgc ctttaaatac tgctttcttg attgtttccc atttgtgctt attagcctta
780 

ctctggtggc ggagtcgaga tgtacactta gaaagcaaaa ccgaaattgc tagtttttat 
840 

cagtttattt ggaagctatt tttcttagag tacttgctgt atcccttggc tctgtggtta 
900 

cctaattttt ctaatactat tttttag 
927 

<210> 37 
<211> 308
<212> PRT 

<213> Synechocystis sp 
<400> 37

Met Ala Thr Ile Gln Ala Phe Trp Arg Phe Ser Arg Pro His Thr Ile
1 5 10 15

Ile Gly Thr Thr Leu Ser Val Trp Ala Val Tyr Leu Leu Thr Ile Leu
20 25 30

Gly Asp Gly Asn Ser Val Asn Ser Pro Ala Ser Leu Asp Leu Val Phe
35 40 45

Gly Ala Trp Leu Ala Cys Leu Leu Gly Asn Val Tyr Ile Val Gly Leu
50 55 60

Asn Gln Leu Trp Asp Val Asp Ile Asp Arg Ile Asn Lys Pro Asn Leu
65 70 75 80

Pro Leu Ala Asn Gly Asp Phe Ser Ile Ala Gln Gly Arg Trp Ile Val
85 90 95

Gly Leu Cys Gly Val Ala Ser Leu Ala Ile Ala Trp Gly Leu Gly Leu
100 105 110

Trp Leu Gly Leu Thr Val Gly Ile Ser Leu Ile Ile Gly Thr Ala Tyr
115 120 125

Ser Val Pro Pro Val Arg Leu Lys Arg Phe Ser Leu Leu Ala Ala Leu
130 135 140

Cys Ile Leu Thr Val Arg Gly Ile Val Val Asn Leu Gly Leu Phe Leu
145 150 155 160

Phe Phe Arg Ile Gly Leu Gly Tyr Pro Pro Thr Leu Ile Thr Pro Ile
165 170 175

Trp Val Leu Thr Leu Phe Ile Leu Val Phe Thr Val Ala Ile Ala Ile
180 185 190

Phe Lys Asp Val Pro Asp Met Glu Gly Asp Arg Gln Phe Lys Ile Gln
195 200 205

Thr Leu Thr Leu Gln Ile Gly Lys Gln Asn Val Phe Arg Gly Thr Leu
210 215 220

Ile Leu Leu Thr Gly Cys Tyr Leu Ala Met Ala Ile Trp Gly Leu Trp
225 230 235 240

Ala Ala Met Pro Leu Asn Thr Ala Phe Leu Ile Val Ser His Leu Cys
245 250 255

Leu Leu Ala Leu Leu Trp Trp Arg Ser Arg Asp Val His Leu Glu Ser
260 265 270

Lys Thr Glu Ile Ala Ser Phe Tyr Gln Phe Ile Trp Lys Leu Phe Phe
275 280 285

Leu Glu Tyr Leu Leu Tyr Pro Leu Ala Leu Trp Leu Pro Asn Phe Ser
290 295 300

Asn Thr Ile Phe
305

<210> 38 
<211> 1092
<212> DNA 

<213> Synechocystis sp 
<400> 38 

atgaaatttc cgccccacag tggttaccat tggcaaggtc aatcaccttt ctttgaaggt 
60 

tggtacgtgc gcctgctttt gccccaatcc ggggaaagtt ttgcttttat gtactccatc 
120 

gaaaatcctg ctagcgatca tcattacggc ggcggtgctg tgcaaatttt agggccggct 
180 

acgaaaaaac aagaaaatca ggaagaccaa cttgtttggc ggacatttcc ctcggtaaaa 
240 

aaattttggg ccagtcctcg ccagtttgcc ctagggcatt ggggaaaatg tagggataac 
300 

aggcaggcga aacccctact ctccgaagaa ttttttgcca cggtcaagga aggttatcaa 
360

atccatcaaa atcagcacca aggacaaatc attcatggcg atcgccattg tcgttggcag 
;f..c.,U, »cc„a.,t .ac«,„,, .,«CU,CC ,,«tcctc,. ,,c»c,,c, . 
:rtt,9c.« conduce =tt,t«,.t ccc„»„c ,..UC«« .,cccaa„ 
=^2,c,cac, ,c.„«,.. at,,c.,a,, ,..ca,Ut, aa««ac=a CCCCU," 

SU,"' """"" ■ 

?r»«cct, .cc.«ca,, act,.,c,tc act,=c,ct, ,c9,«..=. ' 
',^?c,=ccc, aa,.„U,c »taa«„c «acatcacc a,„t,at« «ac,.a«t 
r,U«ce a.„„c.,. c.c«,,caa ,U,ctccc. „„cc,«, ,caa«.a,. 
Tca^caat, aU„Utt, ,,tca.,«, tce,,aaaaa =.,.ta.aaa .„c.,t ta 
rcccctc ccoccooa ,,,e«ac.a ctc.ac.,cc ,.=.ta=cae ta,„,cUt 
??°taUt,c aatt„gatc t,t,,,tcac „cct,ata, t,ca.,„,. aacaccc 
S?„cta, aa,tt„a„ t,.«,„^ ttaaea,a„ a..a«t,a, caaaaaaac. , 
1080 

gtgccattct ga 
1092 

<210> 39 
<211> 363 
<212> PRT 

<213> Synechocystis sp 

S:;Vs%h. pro His se. OX, Tyr Hia ..P =1. 01" 
P.a OX, tIp TV. va. .|U ^« ... P« - - 
S„ ^pS. H,t TV. ;ja .lu Pro .la S,r ^ Hia ^a 

0.V Ua v.. 0|n U» .X, Pro -r Lya Lys .Xn 
CX„ =X„ =X« »ap oxn U« V.X Trp «, Thr Pha Pro S.r V.X JJ. 
Sa P^a Trp >aa sar ^ro ^, OX. PK. JX. OX, Hia Trp OX, ,a 
C,a ^ «p 0X» .Xa .,a Pro X.- S.r OXo 0X„ P.. e 

100 . iV; cm San Gin His GXn GXy 

Ma Thr vaX lya Ola OX, T,r 01„ Ila His X25 ^ ^ , _ 

GXn XX, His Oly .ap .r, Sfs Cys ^, Trp Oln P.. ^ VaX OX. 

V.X T.r ,rp OX, Pro P.| Pro -a T.r 5^ 

^„ S.r P.a S pro La. P.= «P Pro OX, .rp 0X„ n. . » 
J 0X„ OX, )ll .la Hi. Oly Trp iaS .ys Trp oln 0 . 1. 
,y, 0x0 P.e Z Kis v;j i'y? «a OX- ^rp OX, Hia 

S sar ,rp PJ. fr? ... OXa .X, T,r P.. ro « 
His S° OXy L.. S.r val T.'r AXa OXy Oly OX. XXa V.X X.. 
225 230 235 240

Gly Arg Pro Glu Glu Val Ala Leu Ile Gly Leu His His Gln Gly Asn
245 250 255

Phe Tyr Glu Phe Gly Pro Gly His Gly Thr Val Thr Trp Gln Val Ala
260 265 270

Pro Trp Gly Arg Trp Gln Leu Lys Ala Ser Asn Asp Arg Tyr Trp Val
275 280 285

Lys Leu Ser Gly Lys Thr Asp Lys Lys Gly Ser Leu Val His Thr Pro
290 295 300

Thr Ala Gln Gly Leu Gln Leu Asn Cys Arg Asp Thr Thr Arg Gly Tyr
305 310 315 320

Leu Tyr Leu Gln Leu Gly Ser Val Gly His Gly Leu Ile Val Gln Gly
325 330 335

Glu Thr Asp Thr Ala Gly Leu Glu Val Gly Gly Asp Trp Gly Leu Thr
340 345 350

Glu Glu Asn Leu Ser Lys Lys Thr Val Pro Phe
355 360

<210> 40 
<211> 56 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 40 

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaaat 
56 

<210> 41 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 41 

tcgaggatcc gcggccgcaa gcttcctgca gg 
32 

<210> 42 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 42 

tcgacctgca ggaagcttgc ggccgcggat cc 
32 

<210> 43 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 43

tcgacctgca ggaagcttgc ggccgcggat cc 
32 

<210> 44 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide
<400> 44 

tcgaggatcc gcggccgcaa gcttcctgca gg 
32 

<210> 45 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 45 

tcgaggatcc gcggccgcaa gcttcctgca ggagct 
36 

<210> 46 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 46 

cctgcaggaa gcttgcggcc gcggatcc 
28 

<210> 47
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide
<400> 47 

tcgacctgca ggaagcttgc ggccgcggat ccagct 
36 

<210> 48 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 48

ggatccgcgg ccgcaagctt cctgcagg
28 

<210> 49 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 49 

gatcacctgc aggaagcttg cggccgcgga tccaatgca 
39 

<210> 50 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 50 

ttggatccgc ggccgcaagc ttcctgcagg t 
31 

<210> 51 
<211> 41 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 51 

ggatccgcgg ccgcacaatg gagtctctgc tctctagttc t 
41

<210> 52 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 52 

ggatcctgca ggtcacttca aaaaaggtaa cagcaagt 
38 

<210> 53 
<211> 45 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 53 
ggatccgcgg ccgcacaatg gcgttttttg ggctctcccg tgttt
45 

<210> 54 
<211> 40
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 54 

ggatcctgca ggttattgaa aacttcttcc aagtacaact 
40 

<210> 55 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide
<400> 55 

ggatccgcgg ccgcacaatg tggcgaagat ctgttgtt 
38 

<210> 56 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 56 

ggatcctgca ggtcatggag agtagaagga aggagct 
37 

<210> 57 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 57 

ggatccgcgg ccgcacaatg gtacttgccg aggttccaaa gcttgcctct 
50 

<210> 58 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 58 

ggatcctgca ggtcacttgt ttctggtgat gactctat 
38

<210> 59 
<211> 38 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 59 

ggatccgcgg ccgcacaatg acttcgattc tcaacact 
38 

<210> 60 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 60 

ggatcctgca ggtcagtgtt gcgatgctaa tgccgt 
36 

<210> 61 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 61 

taatgtgtac attgtcggcc tc 
22 

<210> 62 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide
<400> 62 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt ccacaattcc ccgcaccgtc 
60

<210> 63 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 63 

aggctaataa gcacaaatgg ga 
22 
<210> 64
<211> 63
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide
<400> 64 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggaattgg tttaggttat

60 

ccc 

63 

<210> 65 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 

<400> 65
ggatccatgg ttgcccaaac cccatc 
26 

<210> 66 
<211> 61 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 66 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt gggtaagcaa caatgaccgg 
60 

c
61

<210> 67 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 67 

gaattctcaa agccagccca gtaac 
25 

<210> 68 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 68

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgggtgcga aaagggtttt

60 

ccc 

63 

<210> 69 
<211> 23 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 69 

ccagtggttt aggctgtgtg gtc 
23 

<210> 70 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 70 

ctgagttgga tgtattggat c 
21 

<210> 71
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 71

ggatccatgg ttacttcgac aaaaatcc 
28 

<210> 72 
<211> 60 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 72 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt gctaggcaac cgcttagtac 
60 

<210> 73 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 73

gaattcttaa cccaacagta aagttccc
28 

<210> 74 
<211> 63 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 74 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccggcat tgtcttttac

60

atg 

63 

<210> 75 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 75 

ggaacccttg cagccgcttc 
20 

<210> 76 
<211> 22
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 76 

gtatgcccaa ctggtgcaga gg
22 

<210> 77 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 77 

ggatccatgt ctgacacaca aaataccg 
28 

<210> 78 
<211> 62 
<212> DNA 

<213> Artificial Sequence 
<220>
<223> Description of Artificial Sequence: Oligonucleotide
<400> 78 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt cgccaatacc agccaccaac
60 

<210> 79
<211> 27
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 79 

gaattctcaa atccccgcat ggcctag 
27 

<210> 80
<211> 65 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 80 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggcctacg gcttggacgt 
60 

gtggg 

65 

<210> 81 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 81 

cacttggatt cccctgatct g 
21 

<210> 82 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 82 

gcaatacccg cttggaaaac g 
21 

<210> 83 

<211> 29

<212> DNA 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 83 

ggatccatga ccgaatcttc gcccctagc 
29 

<210> 84
<211> 61 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 84

gcaatgtaac atcagagatt ttgagacaca acgtggcttt caatcctagg tagccgaggc 
60 



<210> 85 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 85 

gaattcttag cccaggccag cccagcc 
27 

<210> 86 
<211> 66 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 86 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggggaatt gatttgttta 
60 

attacc 
66 

<210> 87 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide
<400> 87

gcgatcgcca ttatcgcttg g 
21 
<210> 88
<211> 24
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 88 

gcagactggc aattatcagt aacg 
24 

<210> 89 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 89

ccatggattc gagtaaagtt gtcgc 
25 

<210> 90 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 90 

gaattcactt caaaaaaggt aacag 
25 

<210> 91 
<211> 4550 
<212> DNA 

<213> Arabidopsis sp 
<400> 91 

attttacacc aatttgatca cttaactaaa ttaattaaat tagatgatta tcccaccata 
60 

tttttgagca ttaaaccata aaaccatagt tataagtaac tgttttaatc gaatatgact 
120 

cgattaagat taggaaaaat ttataaccgg taattaagaa aacattaacc gtagtaaccg 
180 

taaatgccga ttcctccctt gtctaaaaga cagaaaacat atattttatt ttgccccata 
240 

tgtttcactc tatttaattt caggcacaat acfctttggtt ggtaacaaaa ctaaaaagga 
300 

caacacgtga tacttttcct cgtccgtcag tcagattttt tttaaactag aaacaagtgg 
360 

caaatctaca ccacattttt tgcttaatct attaacttgt aagttttaaa ttcctaaaaa 
420 

agtctaacta attcttctaa tataagtaca ttccctaaat ttcccaaaaa gtcaaattaa 
480 

taattttcaa aatctaatct aaatatctaa taattcaaaa tcattaaaaa gacacgcaac 
540 

aatgacacca attaatcatc ctcgacccac acaattctac agttctcatg ctaaaccata 



600

ttttbtgctc tcfcgtfccctt caaaatcatt tcthtctctt ctttgattcc caaagatcac 
660

ttctttgtct ttgatttttg atttfcfctttc tctctggcgt gaaggaagaa gctttatttc 
720 

atggagtctc tgctctctag ttcttctctt gtttccgctg gtaaatctcg tccttttctg
780 

gtttcaggtt ttat'ttgttg hfctaggtttc gtttttgbga ttcagaacca tacaaaaagt 
840 

ttgaactttt ctgaatataa aataaggaaa aagtttcgat ttttataatg aahtgtttac 
900 

fcagatcgaag taggtgacaa aggttattgt gtggagaagc ataatttctg ggcttgacfct 
960 

tgaattttgt ttctcatgca tgcaacttat caatcagctg gtgggttbtg bbggaagaag 
1020 

cagaabctaa agctccactc ttbatcaggb tcgtbagggb bbbabgggbb bbbgaaabba 
1080 

aatacbcaab cabcttagbc bcatbabbcb abbggbtgaa bcacattbbc baatbbggaa 
1140 

bbtabgagac aabgbabgbt ggacbbagbt igaagtbcbtc bctbbggbba bagbbgaagb 
1200 

gbtacbgabg ttgtbbagcb cbbbacacca ababatacac ccaabbbbgc agaaatccga 
1260 

gttctgcgtt gtgattcgag taaagttgtc gcaaaaccga agtttaggaa caatcttgtt
1320 

aggcctgabg gtcaaggabc bbcabbgbtg bbgbabccaa aacataagtc gagabbbcgg 
1380 

gbtaabgcca cbgcgggtca gccbgaggcb bbcgacbcga abagcaaaca gaagtcbttb 
1440 

agagacbcgb tagabgcgtb btacaggbtt bcbaggccbc abacagtbab bggcacagtb 
1500 

aagttbcbcb ttaaaaabgt aacbcbbbta aaacgcaabc tbbcagggtb btcaaggaga 
1560 

baacabbagc tcbgbgattg gabbbgcagg tgcbbagcab bbbatcbgta bctbbcbtag 
1620 

cagbagagaa ggtbtctgat atabcbccbb bacbtbbcac bggcabcttg gaggtaatga 
1680 

abababaaca cataatgacc gabgaagaag abacabttbb bbcgtctctc tqtttaaaca 
1740 

abbgggbbbb gtbbbcaggc bgbbgbbgca gcbcbcatga bgaacattba catagbbggg 
1800 

cbaaatcagb tgbcbgabgb bgaaabagab aaggbaacab gcaaatbtbc btcatatgag 
1860 

bbcgagagac tgatgagabt aabagcagcb agtgccbaga bcabcbcbab gbgggttbtb 
1920 

gcaggbbaac aagcccbatc btccabbggc abcaggagaa batbcbgtta acaccggcab 
1980 

bgcaatagba gcbbccbbcb ccabcabggb abggbgccab bttcacaaaa tbtcaacbbb 
2040 

tagaatbcba baagbbacbg aaabagtbbg tbabaaabcg bbabagagtb bctggctbgg 
2100 

gbggabbgbb ggbbcatggc cabbgbtctg ggcbcbbbbb gbgagbbtca bgctcggbac 
2160 

bgcabacbct abcaabgbaa gbaagbbbcb caabacbaga abbbqgcbca aatcaaaabc 
2220 

bgcagbbbcb agbbbbaggb baabgaggbfc bbaabaacbb acbbctacba caaacagbbg 
2280 

ccacbbbbac ggbggaaaag abbbgcabbg gbbgcagcaa bgbgbabccb cgcbgbccga 

gcbabbabbg tbcaaabcgc cbbbbabcba cababbcagg bacbaaacca bbbbccbbab 
2400 
gttttgtagt tgttttcatc aaaatcactt ttatattact aaagctgtga aactttgttg
2460 

cagacacatg tghtbggaag accaafccttg ttcacfcaggc ctcttatttfc cgccactgcg 
2520 

tttatgagct fctttcfcctgt cgttattgca ttgtthaagg taaacaaaga tggaaaaaga 
2580 

fctaaafcctat gfcatacttaa agtaaagcat tctactgtha ttgatgagaa gttttctttt 
2640 

ttggttggat gcaggatata cctgatatcg aaggggataa gatabtcgga atccgatcat 
2700 

tctctgtaac tctgggtcag aaacgggtac gahatctaaa ctaaagaaat tgttttgact 
2760 

caagtgttgg attaagatta cagaagaaag aaaactgttt ttgtfctcbtg caaaabtcag 
2820 

gbgbtbbgga cabgbgbtac acbacbtcaa atggcbbacg ctgbtgcaab bctagbbgga 
2880 

gccacatcbc cattcatatg gagcaaagbc atcbcggtaa caatcbbtcb btacccabcg 
2940 

aaaacbcgcb aabbcabcgt ttgagbggba ctggbttcab bttgbbccgt bcbgtbgabt 
3000 

tttbtbcagg btgbgggtca tgbbabacbc gcaacaacbt bgtgggctcg agcbaagtcc 
3060 

gtbgabctga gtagcaaaac cgaaabaact tcabgttata bgttcabatg gaaggbbaga 
3120 

bbcgbbbata aatagagtct ttacbgccbt bbbabgcgct ccaatbbgga abtaaaabag 
3180 

ccbbbcagtb bcabcgaabc accabbatac bgabaaattc hcatbtcbgc abcagctcbb 
3240 

ttatgcagag tacttgctgt tacctttttt gaagtgactg acattagaag agaagaagat
3300 

ggagabaaaa gaataagtca tcacbatgct tctgtbttta ttacaagtbc atgaaattag 
3360 

gtagtgaacb agtgaatbag agtbtbabbc tgaaacatgg cagacbgcaa aaatatgbca 
3420 

aagatatgaa tbbctgtbgg gtaaagaagt cfccbgcttgg gcaaaafccbb aaggttcggb 
3480 

gtgbtgabat aatgctaagc gaagaaatcg atbcbatgba gaaatbtccg aaacbabgbg 
3540 

baaacabgbc agaacatctc cattctabab cbbcbtctgc aagaaagcbc bgttbbbatc 
3600 

accbaaactc tttabctctg tgbagttaag abatgbatat gbacgbgact acatttttbb 
3660 

gtbgabgtaa tbbgcagaac gtabggattt btgtbagaaa gcatgagttc gaaagbatat 
3720 

gtbtatabab atggabaatt cagaccbaac gtcgaagcbc acaagcataa attcactacb 
3780 

abagttbgcb cbgtaataga tagbbccabt gatgbcbbga aactgbacgt aacbgcctgg 
3840 

gcgtbbbgbg gbtgabactg acbactgagt gbtcbfcbgtg agtgttgtaa gbabacaaga 
3900 

agaagaabat aggcbcacgg gaacgactgt ggbggaagat gaaatggaga tcabcacgba 
3960 

gcggcbbtgc caaagaccga gbcacgabcg agtcbatgaa gbctbtacag cbgctgatba 
4020 

bgabbgacca ttgcbbagag acgcabbgga atcbbactag ggacbbgccb gggagbbbcb 
4080 

bcaagbacgb gbcagabcat acgatgbagg agabbbcacg gcfctbgabgt gbbtgbbbgg 
4140

agbcacaabg cbbaabgggc bbabbggccc aataabagcb agcfccbbtbg cbbtagccgb 
4200 

bbcgbbbgbc ccctggbggt gagbabbatt agggbabggb gtgaccaaag tcaccagacc 
4260

tagagtgaat ctagtagagt cctagaccat ggtccatggc ttttatttgt aatttgaaaa 
4320 

atgaacaatt ctttttgtaa ggaaaacttt hahatagtag acgtttacta tafcagaaact 
4380 

agttgaacta acttcgtgca attgcataat aatggtgtga aatagagggt gcaaaactca
4440 

abaaacafctt cgacghacca agaghtcgaa acaataagca aaatagattt ttttgcttca 
4500 

gactaabttg tacaatgaab ggttaabaaa ccabbgaagc bbbtatbaab 
4550 

<210> 92 
<211> 4450 
<212> DNA 

<213> Arabidopsis sp 
<400> 92 

tbtaggbfcac aaaabcaabg atattgcgba tgtcaacbat aaaagccaaa agbaaagccb 
60 

cttgbbbgac cagaaggtca tgatcattgb abacabacag ccaaactacc bcctggaaga 
120

aaagacabgg abcccaaaca acaacaabag cbbcbbbbac aagaaccagt agtaactagb 
180 

cactaabcta aaagagtbaa gbttcagctb bbcbggcaab ggctccbbga bcatbbcaat 
240 

ccbgaaggag acccacbbbg tagcaagacc atgbccbctg tbtcactbac agbgbgbcbc 
300 

aaaagtcbac btcaabbctb catatabagg bbccbcacac bacagcbtca tcctcabbcg 
360 

bbgacagaga gagagbcttt attgaaaacb bcbbccaagt acaacbccac baaatabaab 
420 

agcaccaaac cactbgbbcg acacaaatct gtacagabab aaaaacacta bbaggtbbtc 
480 

caaggcaaab cacabaabbg gattgbgaaa gagbacaaaa gabaaaccca aabtbtcaba 
540 

ctttctactg cagtcagcac cagatgataa gtcagctgtc cctatttgcc atcctaactg
600 

bccbgabgca gcggccagtg abgcgbaaba tbgccaccct baabcabbag agcgagaaac 
660 

aaaaagaabc aaaagacagb aaabggaatb aggaabcaca aatgagbcct tgbaaagbbb 
720 

atbgagbacc gagabctgca ctgaabccag aaagbgcaag aaaaccbatg gatgcbgbgc 
780 

caaatccagt taaccaaagc tttgtattat caccgaatct aagggctgtt gacttaacac
840 

caacbbbbac abcabcbbcb bbgtccbgga gacacaatab abtagacatt agbccabgga 
900 

aaaaaaatga bbbaaccbag aababcbcaa aatbacbtgc abaaaaacbg aacbbgagcb 
960 

gaaabbbbgg gtbcgbagct bgbggcatab acbabbtcab bbtcaabggg ccacaaaggb 
1020 

aacbbbcbbb bctcacbtcb gbtgcaaacg ggaagacbtb tabggggcba acbctbcact 
1080 

baaagbabag aaabcagabg gaaaaggbgg gagabcaggg baabbbbcbt ctfcbabgabb 
1140 

gacaaaagbc gaacabcgaa abggatgcab bbgcabgaga cabgaaacaa aagcbgaaaa 
1200 

agaaabcbgb ggbggbgaag cbagaaaaag aaaacaaagc aagcaababg cacacabbga 
1260 

gabbaacbac bbbgcbacbg gbcataatca aabagabbbb gaagcbaaaa aabaaaaagb 
1320

gaatataccfc gabgtgcata aabagtatca baaacaaggg tccagcagac tccggagaga 
1380 

tagagaggga gtacaataga tggtgctatg cbtcctthaa cfcgcagtcca tcctaacaat 
1440 

gctccccagt ttafcggtcaa accfcaaaaag gcttgaggct gcaattataa aaacgaatca 
1500 

atcataagaa aatcagaaaa tatataatgt ctaacfcttga gaagccagaa bagattfcaaa 
1560 

ttacccaaaa kgtaaacctc ttcabaagtg ggtaggaaaa gacaagtaac aaagabgaag 
1620 

ccccbaaaac acggcbgcag aababacata cbgaaabgag cbcaagbaga aaagaabbtg 
1680 

abcacaaaac baaagacaag accbgagaac ababcbbcag aabbbgggcc aacbacataa 
1740 

gggbgaacca babgbgbabg bgaabbbbta aacaaacacb bgcaaabacg cgacbbbagg 
1800 

gcaagbaaaa aabccaaaca aaccbgtaab bgbbaagbbg gagaagaabc cctaagccba 
1860 

aaagcaactg cagcccgaga aabccaabcc cbbgaaatgg bgbcaaaaga ccactggcga 
1920 

baggbctbag bbbbgbacga tcaaccbgga babaaaagaa abbbgbaaga caacataabc 
1980 

baaaacaaaa caaccabaca aaabcbbgag cbbbacabac aagcaaccca bcbttgtbba 
2040 

bggaagaatg aabccagbba cabgaabgcb gbgbabcbac ccbaacbacb aaacacabat 
2100 

bbcaabcgaa aaacababbc caccbbcacc ababcbaaca ccbgaagtcb bbcactbbbb 
2160 

gaacgaagbc abcagaacab gcagabaagc babbacccaa aacagagaba bgacbggaaa 
2220 

bgbbgbcgba aabbgabcca acabagaaaa abcaagacca gbbccagabg bcaaagcaab 
2280 

aacacbbbcc caccabggbb acagaaacca bagbbacaca aaacabgbbt ccbaaaccaa 
2340 

cabacbaaag ggabababaa abbbgacabc acbbbabcac cabaccabaa gabagcbtaa 
2400 

aaacaaacbg accbbbgbab cbabgbcctg abcaagcaga bcatbbabag bacaaccagc 
2460 

accbcbaaga agbaabgcbc cgcaaccaaa baaagccaba babbbaaaac bbggaaggcb 
2520 

bccaggatca gcagccaacg caabcgaccb abacaacaab gabggagatt cagagbabcg 
2580 

abcbabbbac abagcbctgg aacbagabcc abgacgaaac abggaacatc gbbabaabab 
2640 

cbaaagactb ccaaacagab bccbgagbaa gaaacccagb ggaacbabag bacbgbaaca 
2700 

bababaaaab caaagaaaac bcaggbbbab agcabbabcc aabccbgatb bcbgccaabc 
2760 

cbbaaccacb ctcccabgcb abcaaaaacc bcagcbcaag abcabacbac cbaattgccb 
2820 

abgagcbctb gggaagabca bbabggabtb gabaacbgaa aaaagbaaca gagaaabagc 
2880 

agacbgcaag aacbacbcca aacbbcbcca cbgababgba bgbagbcbaa caabaabaaa 
2940 

cagacabaaa bbcbbbbabc aagcbbcaag agcaagbbag bcagaaaaca bcacagccaa 
3000 

accaaccagg aaaacacaba acbbbabcac abaaaacbaa abbbaabgba abcbgacbba 
3060 

acabaaacca bccbbbggga cgaaaggaaa cbababaaac abgcagbcbb fccbtbcccbc 
3120 
agctattctt tcggatggat tataatgaat ctcaaaagtg aaatgtcttg attctcagct
3180 

acattactca aaggcgaaga taaacttacc acatacaagg ccacgcaagc aaccaagttc
3240 

caatgggttt atccaatcga gcaagctfcag cataacctct aacttcttct ggtaaataca 
3300 

aatctatcca agaagcttcc ttaacaacaa caccatcact cttctcctta tcakctttct 
3360 

tcggcfcfctcc ctccaaaacc gaagaagacg acgacabtcc acaaabtaat cbgtaattcc 

3420

aaccaacacc aaaaaactbc tccbgatgca abtcbcbbcc bbtacbccat acbbggbaat 

3480 

tabcabbcca bgaaggabaa cacbbagbga aaggattbgb gbaatgggta gbcacaggab 
3540 

bggacaagga bbbabgbbgb gatbgcaaaa gagcagagga agaagabgga gtbacggaga 
3600 

cggaagabbb caacaaccgb cbbgaaacac gggagagccc aaaaaacgcc abcbtbgaga 
3660 

gaaabtgbbg ccbggaagaa acaaagactb gagatttcaa acgbaagtga abtcbbacga 
3720

acgaaagcta actbcbcaag agaabcagab bagtgabbcc tcaaaaacaa acaaaacbat 

3780

ctaabbbcag btbcgagbga tgaagccbta agaatcbaga acctccatgg cgbbtctaab 

3840 

cbctcagaga baatcgaatt ccbtaaacaa bcaaagcbta gaaagagaag aacaacaaca 
3900 

acaacaaaaa aaabcagabt aacaaccgac cagagagcaa cgacgacgcc ggcgagaaag 
3960

agcacgbcgb cbcggagcaa gacbbcbbcb ccagbaaccc ggabggabcg ttaabgggcc 

4020 

bgtagabbab batabbbggg ccgaaacaab bgggbcagca aaaacttggg ggataatgaa 
4080

gaaacacgba cagtabgcat tbaggcbcca aabtaatbgg ccatabaabb cgaatcagab 

4140
aaacbaabca acccctacct bacbbabbtc bcacbgbtbb babttctacc btagtagtbg 

4200 

aagaaacacb tbbatbtatc tbbbcgggac ccaaabtbga baggatcggg ccabtactca 
4260 

bgagcgbcag acacabatta gccbbatcag abtagbgggg taaggtbbtb bbaabtcggb 

4320

aagaagcaac aabcaabgtc ggagaaabba aagaatcbgc abgggcgtgg cgbgatgata 

4380 

bgbgcababg gagtcagttg ccgatcatab abaactabbb abaaactaca tataaagacb 
4440 

actaabagab 
4450 

<210> 93 
<211> 2850 
<212> DNA 

<213> Arabidopsis sp 
<400> 93 

aabtaaaabb bgagcggbcb aaaccabbag accgbtbaga gabccctcca acccaaaaba 
60 

gtcgattttc acgtcttgaa catatattgg gccttaatct gtgtggttag taaagacttt

120

babbggbcaa agaaaaacaa ccabggccca acabgbbgab actbbbabbb aabbatacaa 

180 

gbaccccbga abtcbcbgaa abababbbga bbgacccaga tabbaabbbb aabtabcabt 
240 
tcctgtaaaa gtgaaggagt caccgtgact cgtcgtaatc tgaaaccaat ctgttcatat
lljgaagaag tttctctcgt tctcctccaa cgcgtagaaa attctgacgg cttaacgatg 
tggcgaagat ctgttgttta tcgtttctct tcaagaatct ctgtttcttc ttcgttacca 
aaccctagac tgattccttg gtcccgcgaa ttatgtgccg ttaatagctt ctcccagcct
|cggtctcga cggaatcaac tgctaagtta gggatcactg gtgttagatc hgatgccaat 
cgagtttttg ccactgctac tgccgccgct acagctacag ctaccaccgg tgagatttcg 
tctagagttg cggctttggc tggattaggg catcactacg ctcgttgtta ttgggagctt 
tctaaagcba aacttaggta tgtgtttact tttcbtttct catgaaaaat ctgaaaattt 
ccaattgttg gattcttaaa ttctcatttg ttttatggtt gtagtatgct tgtggttgca 
acttctggaa ctgggtatat tctgggtacg ggaaatgctg caattagctt cccggggctt 
tgttacacat gtgcaggaac catgatgatt gctgcatctg ctaattccfct gaatcaggtc 
|ttgaaatgt tgagaagttc ataaatttcg aatccttgtt gtgtttatgt agttgatctt 
gcttgcttat gtttatgtag ttgaaaagtt taaaaatttc haatccttgg tagtbgatct
cgcttgtttg ttttttcatt ttctagattt ttgagataag caatgattct aagatgaaaa 
gaacgatgct aaggccattg ccttcaggac gtattagtgt tccacacgct gttgcatggg 
ctactattgc tggtgcttct ggtgcttgtt tgttggccag caaggtgaat gtttgttttt 
ttatatgtga tttctttgtt ttatgaatgg gtgattgaga gattatggat ctaaactttt 
gcttccacga caaggttatt gcagactaat atgttggctg ctggacttgc atctgccaat 
cttgtacttt atgcgtttgt ttatactccg ttgaagcaac ttcaccctat caatacatgg 
gttggcgctg ttgttggtgc tatcccaccc ttgcttgggt aaatttttgt tccttttctt 
ctttatttta gcagattctg ttttgttgga tactgctttt aattcaaaat gtagtcatgg 
ttcaccaatt ctatgcttat ctattttgtg tgttgtcagg tgggcggcag cgtctggtca 
gatttcatac aattcgatga ttcttccagc tgctctttac ttttggcaga tacctcattt 
tatggccctt gcacatctct gccgcaatga ttatgcagct ggagggtaag accatatggt 
gtcatatgag attagaatgt ctccttccat gtagtgttga tcttgaacta gttcaatttc 
gtggaatgat cagagtgtcc tagatagtgt cacagcagtc gacattttag tggctagata 
atgagttctfc tccgttagag ataaacattc gcgaacattg tttccagctt ccgcgaccca 
actfcctgatt ttgtttcttg gtaccttgtt ttcagttaca agatgttgtc actctttgat 
ccgbcaggga agagaatagc agcagtggct ctaaggaact gcttttacat gatccctctc 
ggtttcatcg cctatgactg tgagtcttgt agattcatct ttttttfcgta gtttattgac 
tgcattgcbg tatctgabtt tbgctgttcc btccaatttt tgbgacagoo gggbtaacct 



2100

caagttggtt tbgcctcgaa tcaacacttc tcacactagc aatcgctgca acagcatttt 
Ijttctaccg agaccggacc atgcataaag caaggaaaat gttccatgcc agtcttctct 
tccttcctgt tttcatgtct ggtcttcttc tacaccgtgt ctctaatgat aatcagcaac
aactcgtaga agaagccgga ttaacaaatt ctgtatctgg tgaagtcaaa actcagaggc
gaaagaaacg tgtggctcaa cctccggtgg cttabgcctc bgctgcaccg tbtcctttcc 
tcccagctcc tbcctbcbac tcbccatgat aacctbtaag caagctattg aatbtttgga 
aacagaaatb aaaaaaaaaa tcbgaaaagt bcbtaagbbb aabcbbtggt taataabgaa 
gtggagaacg cabacaagbt babgbabbtt btcbcabcbc cacabaatbg tattbtbtct 
cbaagtabgt bbcaaatgat acaaaataca bacttbabca abtatctgab caaabbgabg 
aabbbtbgag ctbbgacgbg btaggtcbat cbaabaaacg bagbaacgaa ttbggbbbbg 
gaaatgaaab ccgataaccg abgabggbgt agagtbaaac gabbaaaccg ggtbggttaa 
aggbcbcgag bcbcgacggc bgcggaaabc ggaaaatcac gatbgaggac btbgagcbgc 
|^|9^«93tg gcgatgaggb bgaaatcaab 

<210> 94 
<211> 3660 
<212> DNA

<213> Arabidopsis sp 
<400> 94 

tatttgtatt tttattgtta aattbtatga tttcacccgg tatababcab cccababbaa 
tatbagattb atbbtttggg cbbbattbgg gbbtbcgabb baaactgggc ccattcbgct 
tcaabgaaac ccbaabgggt bbbgttbggg cbbbggabbb aaaccgggcc cattcbgcbb 
caatgaaggb ccbttgtcca acaaaactaa catccgacac aactagbatt gccaagagga 
tcgbgccaca bggcagbtab bgaatcaaag gccgccaaaa cbgbaacgta gacatbactb 
abcbccggta acggacaacc acbcgtbbcc cgaaacagca acbcacagac tcacaccacb 
ccagtctccg gctbaacbac caccagagac gatbcbctcb bccgtcggbt ctatgacbbc 
gatbcbcaac actgtctcca ccabccacbc ttccagagbb acctccgtcg abcgagtcgg 
agtccfccbcb cbbcggaatb cggabtccgt tgagtbcacb cgccggcgbt ctggtbbcbc 
|acgbbgabc bacgaabcac ccggtagtba gcabbcbgbb ggabagabbg abgaabgbbb 
tcbbcgatbb bttbtbtacb gabcbbgtbg tggatcbcbc gbagggcgga gabbbgtbgt 
gcgbgcggcg gagacbgaba cbgabaaagg babgabbbbb bagbbgbbbb babbbbcbcb 
cbcttcaaaa btctctbbbc aaacacbgtg gcgttbgaat bbccgacggc agttaaabcb 
cagacaccbg acaaggcacc agccggtggb tcaagcabba accagcttct cggbabcaaa 
ggagcatctc aagaaactgt aattttgttc atctcctcag aatcttttaa attatcatat
ttgtggataa tgabghgtta gtbtaggaat tttcctacta aaggtaatct cbtttgagga 
caagtcttgt httbagctta gaaatgatgb gaaaabgbbg bbbgbbagcb aaaaagagbb 
tgbbgbtata bbcbgbabtc agaabaaatg gaagabbcgb cbbcagcbta caaaaccagb 
cacbbggccb ccactggbbb ggggagtcgt cbgbggbgcb gcbgcbbcag gbaabcabac 
gaaccbcbbb bggabcabgc aabacbgtac agaaagbbbb bbcabbbtcc tbccaabtgb 
ttcbbctggc agggaacbbb cabbggaccc cagaggabgb bgctaagbcg atbcbttgca 
tgabgabgbc bggbccbbgb cttacbggcb atacacaggt ctggtbbbac acaacaaaaa 
gcbgactbgt tctbatbcta gtgcabbtgc bbggbgctac aabaaccbag acbtgbcgat 
ttccagacaa tcaacgactg gtatgataga gatatcgacg caattaatga gccatatcgt
ccaabbccab cbggagcaab atcagagcca gaggtaactg agacagaaca bbgbgagcbb 
ttatctcbbt tgbgatbcbg attbctccbb acbccbbaaa abgcaggtba tbacacaagb 
cbgggbgcba ttabbgggag gtcbbggtab bgcbggaaba tbagabgtgb gggtaagbbg 
gcccbbcbga catbaacbag bacagbbaaa gggcacatca gabtbgcbaa aatctbcccb 
batcaggcag ggcataccac bcccactgbc bbcbabcbbg cbbbgggagg atcabbgcba 
bctbabatab acbcbgcbcc accbcbtaag gbaagbbtba btccbaactt ccactcbcba 
gbgabaagac acbccatcca agtbbbggag bbttgaabab cgababcbga acbgabcbca 
bbgcagcbaa aacaaaabgg abgggbbgga aabbbtgcac bbggagcaag cbababbagt 
bbgccabggb aagababcbc gbgbabcaab aatabatggc gbbgbbctca bcbcabbgab 
ttgbbbcbbg ctcacbtgac tgabaggbgg gcbggccaag catbgtbbgg cactcttacg 
ccagabgtbg bbgbbctaac actcbbgtac agcatagcbg gggbacbcbb btggcaaacc 
ttttabgbbg cbbbttbcgb batcbgbbgb aababgctcb tgcbbcabgb tgtacctbbg 
tgataabgca gbbaggaaba gccabbgbba acgacbbcaa aagbgbbgaa ggagabagag 
cabbaggacb tcagbcbcbc ccagbagctt bbggcaccga aactgcaaaa bggabatgcg 
tbggbgctab agacabbacb cagcbbbctg bbgccggbab gbacbabcca ctgbbtbbgb 
gcagcbgbgg cbbcbabbbc bbtbccbbga bcbbabcaac bggababbca ccaahggbaa 
agcacaaabb aabgaagcbg aabcaacaaa ggcaaaacab aaaagbacab bcbaatgaaa 
t|agcbaabg aagaggaggc abcbacbtbb abgbbbcabb agbgbgabbg atggabtbbc 
abbbcabgcb bcbaaaacaa gbabbbbcaa cagbgbcabg aaataacaga acbbababcb 
bcabtbgbac bbbbacbagb ggabgagbba cacaabcabb gtbatagaac caaabcaaag 
gtagagatca tcattagtat atgtctattt tggttgcagg atatctatta gcatctggga
aaccttatta bgcgttggcg tbggttgctt tgatcattcc tcagattgtg ttccaggtaa 
agacgbtaac agtctcacat tataattaab caaabtcbbg fccactcgtct gabbgcbaca 
cbcgcbbcba baaactgcag bbbaaabacb bbcfccaagga cccbgtcaaa tacgacgbca 
agtaccaggb aagbcaacbt agbacacatg ttbgbgttct bbtgaaatab cttbgagagg 
tcbctbaabc agaagtbgcb bgaaacacbc atcbtgabta caggcaagcg cgcagccabb 
cbbggbgcbc ggaababttg baacggcatb agcatcgcaa cacbgaaaaa ggcgtabbbt 
gabggggbbb bgbcgaaagc agaggtgttg acacatcaaa tgbgggcaag tgatggcatc 
aactagtbba aaagabbtbg taaaatgbat gbaccgttab bactagaaac aacbccbgbt 
gbabcaabbb agcaaaacgg cbgagaaatb gtaabtgabg bbaccgtabb tgcgcbccat 
tbbbgcabtb ccbgcbcaba bcgaggattg gggbbbabgt tagbbctgtc acbtcbcbgc 
bbtcagaabg btbtbgbbtt ctgtagtgga bbttaacbat tbtcatcact bbbtgbattg 
abbctaaaca bgbabccaca taaaaacagb aatatacaaa aatgatacbb ccbcaaactb 
tbbabaabcb aaabctaaca actagctagb aacccaacba acttcataca atbaabttga 
gaaacbacaa agacbagact abacababgb batbbaacaa ctbgaaactg bgbtatbacb 
accbgatttb bbbcbatbcb acagccabtt gababgctgc aabcbbaaca babcaagtcb 
cacgbtgbbg gacacaacab actatcacaa gtaagacacg aagbaaaacc aaccggcaac 
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